DESCRIPTION



Development of Novel Anti-Microbial Agents Based on Bacteriophage Genomics 

BACKGROUND OF THE INVENTION

The present invention relates to the field of antibacterial agents and the 
treatment of infections of animals or other complex organisms by bacteria. 

The frequency and spectrum of antibiotic-resistant infections have, in recent
years, increased in both the hospital and community. Certain infections have become
essentially untreatable and are growing to epidemic proportions in the developing
world as well as in institutional settings in the developed world. The staggering
spread of antibiotic resistance in pathogenic bacteria has been attributed to microbial
genetic characteristics, widespread use of antibiotic drugs, and changes in society that
enhance the transmission of drug-resistant organisms. This spread of drug-resistant
microbes is leading to ever-increasing morbidity, mortality and health-care costs.

Ironically, it is the very success of antibiotics, resulting in their widespread
use, that has contributed the most to rising numbers of drug-resistant bacterial strains.
The longer a bacterial strain is exposed to a drug, the more likely it is to acquire
resistance. Today, a total of 160 antibiotics, all based on a few basic chemical
structures and targeting a small number of metabolic pathways, have found their way
to market. Over-prescription of these drugs, as well as the failure of patients to
comply with the complete antibiotic regimen, has led to the rapid emergence of
antibiotic-resistant strains. Such misuse of prescriptions, careless use of antibiotics in
virtually all commercial production of beef and fowl, and changing societal
conditions, such as the growth of day-care centers, increased long-term care in
hospitals, and increased mobility of the population, have provided an environment
where drug-resistant microbes can emerge and spread. Thus, virtually all common
infectious bacteria are becoming, or have already become, resistant to one or more
groups of antibiotics. Such resistance now reaches all classes of antibiotics currently
in use, including: β-lactams, fluoroquinolones, aminoglycosides, macrolide peptides,
chloramphenicol, tetracyclines, rifampicin, folate inhibitors, glycopeptides, and
mupirocin.

Over the last 45 years bacteria have adapted genetically to avoid the
destruction/alteration of the essential pathways that these chemotherapeutic agents
target. Antibiotic-resistant bacterial strains are now emerging at a higher rate than the
rate at which new antibiotics are being developed. The consequence of this dilemma
has been a dramatic increase in the cost of treating infections that would otherwise
easily succumb to routine antibiotic therapy. Furthermore, and perhaps most
importantly, the emergence of multiple-drug-resistant pathogenic bacteria has led to a
significant increase in morbidity and mortality, particularly in institutional settings.

Most major pharmaceutical companies have on-going drug discovery
programs for novel anti-microbials. These are based on screens for small molecule
inhibitors (natural products, bacterial culture media, libraries of small molecules,
combinatorial chemistry) of crucial metabolic pathways of the micro-organism of
interest (e.g., bacteria, fungi, parasites, worms). The screening process is largely for
cytotoxic compounds and in most cases is not based on a known mechanism of action
of the compounds. Pharmaceutical companies have large programs in this area.
Classical drug screening programs are being exhausted and many of these
pharmaceutical companies are looking towards rational drug design programs.

Several small to mid-size biotechnology companies as well as large
pharmaceutical companies have developed systematic high-throughput sequencing
programs to decipher the genetic code of specific micro-organisms of interest. The
goal is to identify, through sequencing, biochemical pathways or intermediates
that are unique to the microorganism. Knowledge of this may, in turn, form the
rationale for a drug discovery program based on the mechanism of action of the
identified enzymes/proteins. Genome Therapeutics Corp., The Institute for Genomic
Research, Human Genome Sciences Inc., and other companies have such sequencing
programs in place. However, one of the most critical steps in this approach is the
ascertainment that the identified proteins and biochemical pathways 1) are
non-redundant and essential for bacterial survival, and 2) constitute suitable and accessible
targets for drug discovery.
SUMMARY OF THE INVENTION



While animals such as humans are, on occasion, infected by pathogenic
bacteria, bacteria also have natural enemies. A number of host-specific viruses,
known as bacteriophages or phages, infect and kill bacteria in the natural
environment. Such bacteriophages generally have small compact genomes and
bacteria are their exclusive hosts. Many known bacteria are host to a large number of
bacteriophages that have been described in the literature. During the 1940's - 1960's,
phage biology was an area of active research. As a testimony to this, the study of
phages which infect and inhibit the enteric bacterium Escherichia coli (E. coli)
contributed much to the early understanding of molecular biology and virology.

As is generally understood, bacteriophages (or phages) are viruses that infect
and kill bacteria. They are natural enemies of bacteria and, over the course of
evolution, have developed proteins (products of DNA sequences) which enable them
to infect a host bacterium, replicate their genetic material, usurp host metabolism, and
ultimately kill their host. The scientific literature well documents the fact that many
known bacteria have a large number of such bacteriophages (Ackermann and DuBow,
1987) that can infect and kill them (for example, see the ATCC bacteriophage
collection at http://www.atcc.org).

This invention utilizes the observation that bacteriophages successfully infect
and inhibit or kill host bacteria, targeting a variety of normal host metabolic and
physiological traits, some of which are shared by all bacteria, pathogenic and
nonpathogenic alike. The term "pathogenic" as used herein denotes a contribution to
or implication in disease or a morbid state of an infected organism. The invention
thus involves identifying and elucidating the molecular mechanisms by which phages
interfere with host bacterial metabolism, an objective being to provide novel targets
for drug design. Whether the phage blocks bacterial RNA transcription or translation,
or attacks other important metabolic pathways, such as cell wall assembly or
membrane integrity, the basic blueprint for a phage's bacteria-inhibiting ability is
encoded in its genome and can be unlocked using bioinformatics, functional
genomics, and proteomics. By these means, the invention utilizes sequence
information from the genomics of bacteriophage to identify novel antimicrobials that
can be further used to actively and/or prophylactically treat bacterial infection.

Two important components of the invention thus are: i) the identification of
bacteria-inhibiting phage open reading frames ("ORF"s) and corresponding products
that can be used to develop antibiotics based on amino acid sequence and secondary
structural characteristics of the ORF products, and ii) the use of bacteriophages to map
out essential bacterial target genes and homologs, which can in turn lead to the
development of suitable anti-microbial agents. These two avenues represent new and
general methods for developing novel antimicrobials.

The invention thus concerns the identification of bacteriophage ORFs that
supply bacteria-inhibiting functions. In this regard, the terms "inhibit",
"inhibition", "inhibitory", and "inhibitor" all refer to a function of reducing a
biological activity or function. Such reduction in activity or function can, for
example, be in connection with a cellular component, e.g., an enzyme, or in
connection with a cellular process, e.g., synthesis of a particular protein, or in
connection with an overall process of a cell, e.g., cell growth. In reference to bacterial
cell growth, for example, an inhibitory effect (i.e., a bacteria-inhibiting effect) may be
bactericidal (killing of bacterial cells) or bacteriostatic (i.e., stopping or at least
slowing bacterial cell growth). The latter slows or prevents cell growth such that
fewer cells of the strain are produced relative to uninhibited cells over a given period
of time. From a molecular standpoint, such inhibition may equate with a reduction in
the level of, or elimination of, the transcription and/or translation of a specific
bacterial target(s), or reduction or elimination of activity of a particular target
biomolecule.

It is particularly advantageous to evaluate a plurality of different phage ORFs
for inhibitory activity; the ORFs may be from one phage, but are preferably from a
plurality of different phage. For example, evaluating ORFs from a number of different
phage of the same bacterial host provides at least two advantages. First, the multiple
phages will provide identification of a variety of different targets. Second, it is likely
that multiple phage will utilize the same cellular target.

As used herein, the terms "bacteriophage" and "phage" are used
interchangeably to refer to a virus which can infect a bacterial strain or a number of
different bacterial strains.

In the context of this invention, the term "bacteriophage ORF" or "phage
ORF" or similar term refers to a nucleotide sequence in or from a bacteriophage. In
connection with a particular ORF, the terms refer to an open reading frame which has at
least 95% sequence identity, preferably at least 97% sequence identity, more
preferably at least 98% sequence identity with an ORF from the particular phage
identified herein (e.g., with an ORF as identified herein) or to a nucleic acid sequence
which has the specified sequence identity percentage with such an ORF sequence.

A first aspect of the invention thus provides a method for identifying a
bacteriophage nucleic acid coding region encoding a product active on an essential
bacterial target by identifying a nucleic acid sequence encoding a gene product which
provides a bacteria-inhibiting function when the bacteriophage infects a host
bacterium, preferably one that is an animal or plant pathogen, more preferably a bird
or mammalian pathogen, and most preferably a human pathogen. The bacteriophage
is an uncharacterized bacteriophage. Thus, the method excludes, for example, phage
λ, φX174, M13 and other E. coli-specific bacteriophage that have been studied with
respect to gene number and/or function. It also excludes, for example, the nucleic
acid coding regions described in Tables 12-14, and in preferred embodiments,
excludes the phage in which those regions are naturally located.

In connection with bacteriophage, the term "uncharacterized" means that a
certain bacteriophage's genome has not yet been fully identified such that the genes
having function involved in inhibiting host cells have not been identified. In
particular, phage for which the description of genomic or protein sequence was first
provided herein are uncharacterized. Phage sequences for which host
bacteria-inhibiting functions have been identified prior to the filing of the present application
(or alternatively prior to the present invention) are specifically excluded from the
aspects involving utilization of sequences from uncharacterized bacteriophage, except
that aspects may involve a plurality of phage where one or more of those phage are
uncharacterized and one or more others have been characterized to some extent. A
number of different bacteria-inhibiting phage ORFs are indicated in Tables 11-14.
The phage ORFs or sequences identified therein are not within the term
"uncharacterized"; alternatively, in preferred embodiments the phage containing those
ORFs are excluded from this term. Further, any additional phage ORFs (or
alternatively the phage which contain those ORFs) which have previously been
described in the art as bacteria-inhibiting ORFs are expressly excluded; those ORFs or
phage are known to those skilled in the art and the exclusion can be made express by
specifically naming such ORFs or phage as needed (likewise for uncharacterized
targets as described below). For the sake of brevity, such a listing is not expressly
presented, as such information is readily available to those skilled in the art.

Stating that an agent or compound is "active on" a particular cellular target,
such as the product of a particular gene, means that the target is an important part of a
cellular pathway which includes that target and that the agent acts on that pathway.
Thus, in some cases the agent may act on a component upstream or downstream of the
stated target, including on a regulator of that pathway or a component of that pathway.
By "essential", in connection with a gene or gene product, is meant that the host
cannot survive without, or is significantly growth-compromised in the absence,
depletion, or alteration of, functional product. An "essential gene" is thus one that
encodes a product that is beneficial, or preferably necessary, for cellular growth in
vitro in a medium appropriate for growth of a strain having a wild-type allele
corresponding to the particular gene in question. Therefore, if an essential gene is
inactivated or inhibited, that cell will grow significantly more slowly, preferably at less
than 20%, more preferably less than 10%, most preferably less than 5% of the growth
rate of the uninhibited wild-type, or not at all, in the growth medium. Preferably, in
the absence of activity provided by a product of the gene, the cell will not grow at all
or will be non-viable, at least under culture conditions similar to the in vivo conditions
normally encountered by the bacterial cell during an infection. For example, absence
of the biological activity of certain enzymes involved in bacterial cell wall synthesis
can result in the lysis of cells under normal osmotic conditions, even though
protoplasts can be maintained under controlled osmotic conditions. In the context of
the invention, essential genes are generally the preferred targets of antimicrobial
agents. Essential genes can encode target molecules directly or can encode a product
involved in the production, modification, or maintenance of a target molecule.
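
The growth-rate tiers given above for an inhibited essential gene (less than 20%, 10%, or 5% of the uninhibited wild-type rate, or no growth at all) can be illustrated with a short sketch. The function name, argument names, and the returned labels are hypothetical and chosen only for illustration; how the growth rates themselves are measured is not specified here and is left to the user.

```python
def classify_growth(inhibited_rate: float, wild_type_rate: float) -> str:
    """Map a relative growth rate onto the preference tiers quoted in the text.

    Both rates must be in the same units (for example, doublings per hour).
    The labels returned are illustrative, not terms defined by the document.
    """
    if wild_type_rate <= 0:
        raise ValueError("wild-type growth rate must be positive")
    relative = inhibited_rate / wild_type_rate
    if relative <= 0:
        return "no growth"
    if relative < 0.05:
        return "most preferred (<5%)"
    if relative < 0.10:
        return "more preferred (<10%)"
    if relative < 0.20:
        return "preferred (<20%)"
    return "not below the 20% threshold"


if __name__ == "__main__":
    # Hypothetical measurements: wild type at 2.0 doublings/h, inhibited at 0.15.
    print(classify_growth(0.15, 2.0))
```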

A "target" refers to a biomolecule that can be acted on by an exogenous agent,
thereby modulating, preferably inhibiting, growth or viability of a cell. In most cases
such a target will be a nucleic acid sequence or molecule, or a polypeptide or protein.
However, other types of biomolecules can also be targets, e.g., membrane lipids and
cell wall structural components.

The term "bacterium" refers to a single bacterial strain, and includes a single
cell, and a plurality or population of cells of that strain unless clearly indicated to the
contrary. In reference to bacteria or bacteriophage, the term "strain" refers to bacteria
or phage having a particular genetic content. The genetic content includes genomic
content as well as recombinant vectors. Thus, for example, two otherwise identical
bacterial cells would represent different strains if each contained a vector, e.g., a
plasmid, with different phage ORF inserts.

In preferred embodiments, the phage is Staphylococcus aureus phage 77, 3A,
96, or 44AHJD, Enterococcus sp. phage 182, or Streptococcus pneumoniae phage
Dp-1.

In preferred embodiments, the phage is selected from. Preferred embodiments
involve expressing at least one recombinant phage ORF(s) in a bacterial host followed
by inhibition analysis of that host. Inhibition following expression of the phage ORF
is indicative that the product of the ORF is active on an essential bacterial target.
Such evaluation can be carried out in a variety of different formats, such as on a
support matrix such as a solidified medium in a petri dish, or in liquid culture.
Preferably a plurality of phage ORFs are expressed in at least one bacterium. The
plurality of phage ORFs can be from one or a plurality of phage. With respect to a
single phage or at least one phage in a plurality of phages, the plurality of expressed
ORFs preferably represents at least 10%, more preferably at least 20%, 40%, or 60%,
still more preferably at least 80% or 90%, and most preferably at least 95% of the
ORFs in the phage genome. For a plurality of phage, the plurality of
expressed ORFs preferably represents at least 10%, more preferably at least 20%,
40%, or 60%, still more preferably at least 80% or 90%, and most preferably at least
95% of the ORFs in the phage genome of each phage. The plurality of phage ORFs
can be expressed in a single bacterium, or in a plurality of bacteria where one ORF is
expressed in each bacterium, or in a plurality of bacteria where a plurality of ORFs are
expressed in at least one or in all of the plurality of bacteria, or combinations of these.

In embodiments of the above aspect (as well as in other aspects herein) in
which a plurality of phage are utilized, a plurality of phage have the same bacterial
host species; have different bacterial host species; or both. The plurality of phage
includes at least two different phage, preferably at least 3, 4, 5, 6, 8, 10, 15, 20, or more
different phage. Indeed, more preferably, the plurality of phage will include 50, 75,
100, or more phage. As described herein, the larger number of phage is useful to
provide additional target and target evaluation information useful in developing
antibacterial agents, for example, by providing identification of a larger range of
bacterial targets, and/or providing further indication of the suitability of a particular
target (for example, utilization of a target by a number of different unrelated phage
can suggest that the target is particularly stable and accessible and effective) and/or
can indicate alternate sites on a target which interact with different inhibitors.

Further embodiments involve confirmation of the inhibitor function of the
phage ORF, such as by utilizing or incorporating a control(s) designed to confirm the
inhibitory nature of the ORF(s) being evaluated. The control can, for example, be
provided by expression of an inactive or partially inactive form of the ORF or ORF
product, and/or by the absence of expression of the ORF or ORF product in the same
or a closely comparable bacterial strain as that used for expression of the test ORF.
The reduced level of activity or the absence of active ORF product in the control will
thus not provide the inhibition provided by a corresponding inhibitory ORF, or will
provide a distinguishably lower level of inhibition. An inactivated or partially
inactivated control has a mutation(s), e.g., in the coding region or in flanking
regulatory elements, that reduce(s) or eliminate(s) the normal function of the ORF.
Thus, the inhibition of a bacterium following expression of a phage ORF is
determined by comparison with the effects of expression of an inactivated ORF or the
response of the bacteria in the absence of expression in the same or similar type
bacterium. Such determination of inhibition of the bacterium following expression of
the ORF is indicative of a bacteria-inhibiting function. These manipulations are
routinely understood and accomplished by those of skill in the art using standard
techniques. In embodiments utilizing absence of expression of the ORF, the bacteria
can, for example, contain an empty vector or a vector which allows expression of an
unrelated sequence which is preferably non-inhibitory. Alternatively, the bacteria
may have no vector at all. Combinations of such controls or other controls may also
be utilized as recognized by those skilled in the art.
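
The comparison between a test strain expressing a phage ORF and a control strain (for example, one carrying an empty vector) can be sketched numerically as follows. This is only an illustration under stated assumptions: growth is assumed to be read out as optical density values from replicate cultures, and the 50% cut-off, like the function names, is hypothetical and not taken from this document.

```python
from statistics import mean
from typing import Sequence


def inhibition_fraction(test_od: Sequence[float], control_od: Sequence[float]) -> float:
    """Fractional growth reduction of the test strain relative to the control.

    0.0 means the test strain grew as well as the control; 1.0 means no growth.
    """
    control_mean = mean(control_od)
    if control_mean <= 0:
        raise ValueError("control cultures must show measurable growth")
    return 1.0 - mean(test_od) / control_mean


def is_inhibitory(test_od: Sequence[float], control_od: Sequence[float],
                  min_reduction: float = 0.5) -> bool:
    """Flag an ORF as inhibitory when growth drops by at least min_reduction.

    min_reduction is an assumed, illustrative cut-off.
    """
    return inhibition_fraction(test_od, control_od) >= min_reduction


if __name__ == "__main__":
    # Hypothetical replicate readings after induction.
    orf_expressing = [0.08, 0.10, 0.12]
    empty_vector = [0.95, 1.00, 1.05]
    print(is_inhibitory(orf_expressing, empty_vector))
```

In practice a distinguishably lower level of inhibition in the control would be judged with replicate statistics appropriate to the assay; the fixed cut-off here only keeps the sketch short.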

In embodiments involving expression of a phage ORF in a bacterial strain, in 
preferred embodiments that expression is inducible. 

By "inducible" is meant that expression is absent or occurs at a low level until 
the occurrence of an appropriate environmental stimulus provides otherwise. For the 
present invention such induction is preferably controlled by an artificial 
environmental change, such as by contacting a bacterial strain population with an 
inducing compound (i.e., an inducer). However, induction could also occur, for 
example, in response to build-up of a compound produced by the bacteria in the 
bacterial culture, e.g., in the medium. As uncontrolled or constitutive expression of 
inhibitory ORFs can severely compromise bacteria to the point of eradication, such 
expression is therefore undesirable in many cases because it would prevent effective 
evaluation of the strain and inhibitor being studied. For example, such uncontrolled 
expression could prevent any growth of the strain following insertion of a 
recombinant ORF, thus preventing determination of effective transfection or 
transformation. A controlled or inducible expression is therefore advantageous and is 
generally provided through the provision of suitable regulatory elements, e.g., 
promoter/operator sequences that can be conveniently transcriptionally linked to a 
coding sequence to be evaluated. In most cases, the vector will also contain 
sequences suitable for efficient replication of the vector in the same or different host 
cells and/or sequences allowing selection of cells containing the vector, i.e., 
"selectable markers." Further, preferred vectors include convenient primer sequences 
flanking the cloning region from which PCR and/or sequencing may be performed. 

As knowledge of the nucleotide sequence of phage ORFs is useful, e.g., for 
assisting in the identification of phage proteins active against essential bacterial host 
targets, preferred embodiments involve the sequencing of at least a portion of the 
phage genome in combination with the above methods. This can be done either before 
or after or independent of expression and inhibition of the ORF in the bacteria, and 
provides information on the nature and characteristics of the ORF. Such a portion is 
preferably at least 10%, 20%, 40%, 80%, 90%, or 100% of the phage genome. For
embodiments in which a plurality of phage are utilized, preferably each phage is 
sequenced to an extent as just specified. 

Such sequencing is preferably accompanied by computer sequence analysis to
define and evaluate ORF(s), ORF products, structural motifs or functional properties
of ORF products, and/or their genetic control elements. Thus, certain embodiments
incorporate computer sequence analyses of nucleic acid and/or amino acid sequences.
Further, existing data banks can provide phage sequence and product information
which can be utilized for analysis and identification of ORFs in the sequence.
Computer analysis may further employ known homologous sequences from other
species that suggest or indicate conserved underlying biochemical function(s) for the
inhibitory or potentially inhibitory ORF sequence(s) being evaluated. This can
include the sequences of signature motifs of identified classes of inhibitors.
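
As an illustration of what computer analysis to define ORFs can look like, the following minimal sketch scans all six reading frames of a DNA sequence for stretches that begin with ATG and end at a stop codon. It is a simplification under stated assumptions, not the analysis used in this document: only ATG is treated as a start codon (bacteria and phage also use alternatives such as GTG and TTG), the standard stop codons are assumed, and the function names and the `min_codons` cut-off are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """Return (strand, frame, start, end) tuples for simple ATG...stop ORFs.

    Coordinates are 0-based on the scanned strand; end is exclusive and
    includes the stop codon. min_codons counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    # Toy sequence: ATG AAA TTT GGG TAA embedded in flanking bases.
    print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=4))
```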

In the context of the phage nucleic acid sequences, e.g., gene sequences, of this
invention, the terms "homolog" and "homologous" denote nucleotide sequences from
different bacteria or phage strains or species or from other types of organisms that
have significantly related nucleotide sequences, and consequently significantly related
encoded gene products, preferably having related function. Homologous gene
sequences or coding sequences have at least 70% sequence identity (as defined by the
maximal base match in a computer-generated alignment of two or more nucleic acid
sequences) over at least one sequence window of 48 nucleotides, more preferably at
least 80 or 85%, still more preferably at least 90%, and most preferably at least 95%.
The polypeptide products of homologous genes have at least 35% amino acid
sequence identity over at least one sequence window of 18 amino acid residues, more
preferably at least 40%, still more preferably at least 50% or 60%, and most
preferably at least 70%, 80%, or 90%. Preferably, the homologous gene product is
also a functional homolog, meaning that the homolog will functionally complement
one or more biological activities of the product being compared. For nucleotide or
amino acid sequence comparisons where a homology is defined by a % sequence
identity, the percentage is determined using BLAST programs with default
parameters (Altschul et al., 1997, "Gapped BLAST and PSI-BLAST: a new
generation of protein database search programs," Nucleic Acids Res. 25:3389-3402).
Any of a variety of algorithms known in the art which provide comparable results can
also be used, preferably using default parameters. Performance characteristics for
three different algorithms in homology searching are described in Salamov et al., 1999,
"Combining sensitive database searches with multiple intermediates to detect distant
homologues." Protein Eng. 12:95-100. Another exemplary program package is the
GCG™ package from the University of Wisconsin.
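
The window-based identity criterion (for example, at least 70% identity over at least one window of 48 nucleotides, or at least 35% over 18 amino acid residues) can be illustrated with the sketch below. It assumes the two sequences have already been aligned to equal length by some external program, with "-" marking gaps; it does not reproduce BLAST or its default parameters, and the function names are hypothetical.

```python
def best_window_identity(a: str, b: str, window: int = 48) -> float:
    """Highest fraction of identical positions in any window of the alignment.

    a and b are pre-aligned strings of equal length; gap characters ("-")
    never count as matches.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if len(a) < window:
        raise ValueError("alignment is shorter than the window")
    matches = [x == y and x != "-" for x, y in zip(a.upper(), b.upper())]
    count = sum(matches[:window])
    best = count
    # Slide the window one position at a time, updating the match count.
    for i in range(window, len(matches)):
        count += matches[i] - matches[i - window]
        best = max(best, count)
    return best / window


def meets_identity_criterion(a: str, b: str, window: int = 48,
                             threshold: float = 0.70) -> bool:
    """True when at least one window reaches the identity threshold."""
    return best_window_identity(a, b, window) >= threshold


if __name__ == "__main__":
    ref = "ACGT" * 12                  # 48 nucleotides
    variant = "N" * 24 + ref[24:]      # first half does not match
    print(best_window_identity(ref, variant))      # 24 of 48 positions match
    print(meets_identity_criterion(ref, variant))  # below the 70% threshold
```

For the polypeptide criterion the same functions could be called with `window=18` and `threshold=0.35` on aligned amino acid sequences.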

Homologs may also or in addition be characterized by the ability of two
complementary nucleic acid strands to hybridize to each other under appropriately
stringent conditions. Hybridizations are typically and preferably conducted with
probe-length nucleic acid molecules, preferably 20-100 nucleotides in length. Those
skilled in the art understand how to estimate and adjust the stringency of hybridization
conditions such that sequences having at least a desired level of complementarity will
stably hybridize, while those having lower complementarity will not. For examples of
hybridization conditions and parameters, see, e.g., Maniatis, T. et al. (1989)
Molecular Cloning: A Laboratory Manual. Cold Spring Harbor University Press, Cold
Spring, N.Y.; Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology.
John Wiley & Sons, Secaucus, N.J. Homologs and homologous gene sequences may
thus be identified using any nucleic acid sequence of interest, including the phage
ORFs and bacterial target genes of the present invention.

A typical hybridization, for example, utilizes, besides the labeled probe of
interest, a salt solution such as 6xSSC (NaCl and Sodium Citrate base) to stabilize
nucleic acid strand interaction, a mild detergent such as 0.5% SDS, together with
other typical additives such as Denhardt's solution and salmon sperm DNA. The
solution is added to the immobilized sequence to be probed and incubated at suitable
temperatures to preferably permit specific binding while minimizing nonspecific
binding. The temperature of the incubations and ensuing washes is critical to the
success and clarity of the hybridization. Stringent conditions employ relatively higher
temperatures, lower salt concentrations, and/or more detergent than do non-stringent
conditions. Hybridization temperatures also depend on the length, complementarity
level, and nature (i.e., "GC content") of the sequences to be tested. Typical stringent
hybridizations and washes are conducted at temperatures of at least 40°C, while lower
stringency hybridizations and washes are typically conducted at 37°C down to room
temperature (~25°C). One of skill in the art is aware that these conditions may vary
according to the parameters indicated above, and that certain additives such as
formamide and dextran sulphate may also be added to affect the conditions.

By "stringent hybridization conditions" is meant hybridization conditions at
least as stringent as the following: hybridization in 50% formamide, 5X SSC, 50 mM
NaH2PO4, pH 6.8, 0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and 5X
Denhardt's solution at 42°C overnight; washing with 2X SSC, 0.1% SDS at 45°C; and
washing with 0.2X SSC, 0.1% SDS at 45°C.
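
To make the salt concentrations behind the "X SSC" notation concrete, the sketch below converts an SSC fold-concentration into molar NaCl and sodium citrate. It relies on the common laboratory convention that 1X SSC is 0.15 M NaCl and 0.015 M sodium citrate; that convention is an assumption brought in for illustration and is not stated in this document, and the function name is hypothetical.

```python
# Assumed convention: 1X SSC = 0.15 M NaCl + 0.015 M sodium citrate.
NACL_M_PER_X = 0.15
CITRATE_M_PER_X = 0.015


def ssc_molarity(fold: float) -> tuple:
    """Return (NaCl molarity, sodium citrate molarity) for a given X SSC."""
    if fold < 0:
        raise ValueError("fold concentration cannot be negative")
    return (fold * NACL_M_PER_X, fold * CITRATE_M_PER_X)


if __name__ == "__main__":
    # Hybridization and the two wash steps named in the definition above.
    for label, fold in (("hybridization", 5.0), ("first wash", 2.0), ("second wash", 0.2)):
        nacl, citrate = ssc_molarity(fold)
        print(f"{label}: {fold}X SSC = {nacl:.3f} M NaCl, {citrate:.4f} M citrate")
```

The falling salt concentration from the hybridization step to the final wash is consistent with the statement that stringent conditions employ lower salt concentrations.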
In sequence comparison analyses, an ORF, or motif, or set of motifs in a
bacteriophage sequence can be compared to known inhibitor sequences, e.g.,
homologous sequences encoding homologous inhibitors of bacterial function.
Likewise, the analysis can include comparison with the structure of essential bacterial
gene products, as structural similarities can be indicative of similar or replacement
biological function. Such analysis can include the identification of a signature, or
characteristic motif(s), of an inhibitor or inhibitor class.

Also, the identification of structural motifs in an encoded product, based on
nucleotide or amino acid sequence analysis, can be used to infer a biochemical
function for the product. A database containing identified structural motifs in a large
number of sequences is available for identification of motifs in phage sequences. The
database is PROSITE, which is available at www.expasy.ch/cgi-bin/scanprosite. The
identification of motifs can, for example, include the identification of signature motifs
for a class or classes of inhibitory proteins. Other such databases may also be used.
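
Motif identification of the kind described above can be sketched by translating a PROSITE-style pattern into a regular expression and scanning a protein sequence with it. The converter below handles only the basic pattern elements (`x`, `[...]`, `{...}`, repetition counts in parentheses, and the `<` and `>` terminal anchors outside brackets) and is a simplified illustration, not the ScanProsite implementation. The function names and the example protein sequence are hypothetical; the example pattern is the N-glycosylation site pattern as commonly written in PROSITE syntax.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a basic PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.rstrip(".")
    parts = []
    for element in pattern.split("-"):
        element = element.replace("<", "^").replace(">", "$")   # terminal anchors
        element = element.replace("x", ".")                     # any residue
        element = element.replace("{", "[^").replace("}", "]")  # excluded residues
        element = element.replace("(", "{").replace(")", "}")   # repetition counts
        parts.append(element)
    return "".join(parts)


def find_motifs(sequence: str, pattern: str):
    """Return (start, matched_text) for each non-overlapping motif occurrence."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical protein fragment scanned with the N-glycosylation pattern.
    print(prosite_to_regex("N-{P}-[ST]-{P}."))
    print(find_motifs("MKNGSAANPTQ", "N-{P}-[ST]-{P}."))
```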

In aspects and preferred embodiments described herein, in which a bacterium
or host bacterium is specified, the bacterium or host bacterium is preferably selected
from a pathogenic bacterial species, for example, one selected from Table 1.
Preferably, an animal or plant pathogen is used. For animals, preferably the bacterium
is a bird or mammalian pathogen, still more preferably a human pathogen.

In aspects and preferred embodiments involving a bacteriophage or sequences
from a bacteriophage, one or more bacteriophage are preferably selected from those
listed in Table 1. Those exemplary bacteriophage are readily obtained from the
indicated sources.

In some cases, it is advantageous to utilize phage with non-pathogenic host
bacteria. The genome, structural motif, ORF, homolog, and other analyses described
herein can be performed on such phage and bacteria. Such analysis provides useful
information and compositions. The results of such analyses can also be utilized in
aspects of the present invention to identify homologous ORFs, especially inhibitor
ORFs in phage with pathogenic bacterial hosts. Similarly, identification of a target in
a non-pathogenic host can be used to identify homologous sequences and targets in
pathogenic bacteria, especially in genetically closely related bacteria. Those skilled in
the art are familiar with bacterial genetic relationships and with how to determine
relatedness based on levels of genomic identity or other measures of nucleotide
sequence and/or amino acid sequence similarity, and/or other physical and culture
characteristics such as morphology, nutritional requirements, or minimal media to
support growth.
Also in preferred embodiments, an embodiment of this aspect is combined
with an embodiment of the following aspect.

A related aspect of the invention provides methods for identifying a target for 
antibacterial agents by identifying the bacterial target(s) of at least one 
uncharacterized or untargeted inhibitor protein or RNA from a bacteriophage. Such 
identification allows the development of antibacterial agents active on such targets. 
Preferred embodiments for identifying such targets involve the identification of 
binding of target and phage ORF products to one another. The phage ORF products 
may be subportions of a larger ORF product that also binds the host target. In 
preferred embodiments, the phage protein or RNA is from an uncharacterized 
bacteriophage in Table 1. This aspect preferably includes the identification of a 
plurality of such targets in one or a plurality of different bacteria, preferably in one or 
a plurality of bacteria listed in Table 1. 

In preferred embodiments of this aspect and other aspects of this invention 
involving particular phage ORFs or phage sequences, the ORF is Staphylococcus 
aureus phage 77 ORF 17, 19, 43, 102, 104, or 182 as identified in U.S. application 
09/407,804, S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae
phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or
Enterococcus sp. phage 182 ORF 002, 008, or 014.

As indicated for the above aspect, preferably the method involves the use of a 
plurality of different phage, and thus a plurality of different phage inhibitors and/or 
inhibitor ORFs. 

In addition to uncharacterized phage ORF products, it is also useful to identify
the targets of phage ORF products which are known to be inhibitors of host bacteria, 
but where the target has not been identified. Thus, such inhibitors can likewise be 
utilized as "untargeted" inhibitor phage ORFs and ORF products, e.g., proteins or 
RNAs. 

In the context of inhibitor proteins or RNAs from a phage, the term 
"uncharacterized" means that a bacteria-inhibiting function for the protein has not 
previously been identified. Preferably, but not necessarily, the sequence of the protein 
or the corresponding coding region or ORF was not described in the art before the 
filing of the present application for patent (or alternatively prior to the present 
invention). Thus, this term specifically excludes any bacteria-inhibiting phage protein 
and its associated bacterial target which has been identified as inhibitory before the 
present invention or alternatively before the filing of the present application, for 
example those identified in Tables 12-14 or otherwise identified herein. For example, 
from E. coli, phage T7 genes 0.7 and 2.0 target the host RNA polymerase; phage T4
gp55/gp33 alter the specificity of host RNA polymerase. The T4 regB gene product
also targets the host translation apparatus. As with the uncharacterized bacteriophage 
ORFs or bacteriophage above, for such identified proteins, the sequences encoding 
those proteins are excluded from the uncharacterized inhibitor proteins. 

The term "fragment" refers to a portion of a larger molecule or assembly. For
proteins, the term "fragment" refers to a molecule which includes at least 5 contiguous
amino acids from the reference polypeptide or protein, preferably at least 8, 10, 12,
15, 20, 30, 50 or more contiguous amino acids. In connection with oligo- or
polynucleotides, the term "fragment" refers to a molecule which includes at least 15
contiguous nucleotides from a reference polynucleotide, preferably at least 24, 30, 36,
45, 60, 90, 150, or more contiguous nucleotides.
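
The length thresholds in the "fragment" definition can be illustrated with a short check. The following is a minimal sketch only, assuming sequences are held as plain strings; the function name and the example sequences are hypothetical and are not taken from the specification.

```python
# Minimum contiguous lengths taken from the "fragment" definition above.
MIN_PROTEIN_RUN = 5       # contiguous amino acids
MIN_NUCLEOTIDE_RUN = 15   # contiguous nucleotides


def shares_contiguous_run(candidate: str, reference: str, min_len: int) -> bool:
    """Return True if candidate contains at least min_len contiguous
    residues that also occur contiguously in reference."""
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    if len(candidate) < min_len or len(reference) < min_len:
        return False
    # A shared run of min_len or more implies a shared window of exactly min_len.
    reference_windows = {
        reference[i:i + min_len] for i in range(len(reference) - min_len + 1)
    }
    return any(
        candidate[i:i + min_len] in reference_windows
        for i in range(len(candidate) - min_len + 1)
    )


# Hypothetical sequences, for illustration only.
reference_protein = "MKTAYIAKQRQISFVKSHFSRQ"
print(shares_contiguous_run("GGAYIAKGG", reference_protein, MIN_PROTEIN_RUN))
print(shares_contiguous_run("GGAYIAGG", reference_protein, MIN_PROTEIN_RUN))
```

The first call shares the five-residue run AYIAK with the reference; the second shares only four contiguous residues and so falls below the stated minimum.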

Preferred embodiments involving identification of binding include methods
for distinguishing bound molecules, for example, affinity chromatography,
immunoprecipitation, crosslinking, and/or genetic screen methods that permit
protein:protein interactions to be monitored. One of skill in the art is familiar with
these techniques and common materials utilized (see, e.g., Coligan, J. et al. (eds.)
(1995) Current Protocols in Protein Science. John Wiley & Sons, Secaucus, N.J.).

Genetic screening for the identification of protein:protein interactions typically
involves the co-introduction of both a chimeric bait nucleic acid sequence (here, the
phage ORF to be tested) and a chimeric target nucleic acid sequence that, when
co-expressed and having affinity for one another in a host cell, stimulate reporter gene
expression to indicate the relationship. A "positive" can thus suggest a potential
inhibitory effect in bacteria. This is discussed in further detail in the Detailed
Description section below. In this way, new bacterial targets can be identified that are
inhibited by specific phage ORF products or derivatives, fragments, mimetics, or
other molecules.

Other embodiments involve the identification and/or utilization of mutant
targets by virtue of their host's relatively unresponsive nature in the presence of
expression of ORFs previously identified as inhibitory to the non-mutant or wild-type
strain. Such mutants have the effect of protecting the host from an inhibition that
would otherwise occur and indirectly allow identification of the precise responsible
target for follow-up studies and anti-microbial development. In certain embodiments,
rescue from inhibition occurs under conditions in which a bacterial target or mutant
target is highly expressed. This is performed, for example, through coupling of the
sequence with regulatory element promoters, e.g., as known in the art, which regulate
expression at levels higher than wild-type, e.g., at a level sufficiently higher that the
inhibitor can be competitively bound to the highly expressed target such that the
bacterium is detectably less inhibited.

Identification of the bacterial target can involve identification of a phage- 
specific site of action. This can involve a newly identified target, or a target where the 
phage site of action differs from the site of action of a previously known antibacterial 
agent or inhibitor. For example, phage T7 genes 0.7 and 2.0 target the host RNA 
polymerase, which is also the cellular target for the antibacterial agent, rifampin. To 
the extent that a phage product is found to act at a different site than previously 
described inhibitors, aspects of the present invention can utilize those new, phage- 
specific sites for identification and use of new agents. The site of action can be 
identified by techniques well-known to those skilled in the art, for example, by 
mutational analysis, binding competition analysis, and/or other appropriate 
techniques. 

Once a bacterial host target protein or nucleic acid or mutant target sequence 
has been identified and/or isolated, it too can be conveniently sequenced, sequence 
analyzed (e.g., by computer), and the underlying gene(s), and corresponding translated 
product(s) further characterized. Preferred embodiments include such analysis and 
identification. Preferably such a target has not previously been identified as an 
appropriate target for antibacterial action. 

Certain embodiments include the identification of at least one inhibitory phage 
ORF or ORF product, e.g., as described for the above aspect, and thus are a 
combination of the two aspects. 

Additionally, the invention provides methods for identifying targets for 
antibacterial agents by identifying homologs of a bacterial target (e.g., from S. aureus,
Enterococcus faecalis or other Enterococci, or Streptococcus pneumoniae) of a
bacteriophage inhibitory ORF product. Such homologs may be utilized in the various
aspects and embodiments described herein, as described for the host Enterococcus sp.
for bacteriophage 182.

Other aspects of the invention provide isolated, purified, or enriched specific 
phage nucleic acid and amino acid sequences, subsequences, and homologs thereof for 
phage selected from uncharacterized phage listed in Table 1, preferably from 
bacteriophage 77, 3A, 96, 44AHJD (Staphylococcus aureus host bacterium), Dp-1
(Streptococcus pneumoniae host), or 182 (Enterococcus host) or other phage listed in 
Table 1 for those bacteria. For example, such sequences do not include sequences 
identified in any of Tables 11-14. Nucleotide sequences of this aspect are at least 15 
nucleotides in length, preferably at least 18, 21, 24, or 27 nucleotides in length, more 
preferably at least 30, 50, or 90 nucleotides in length. In certain embodiments, longer 
nucleic acids are preferred, for example those of at least 120, 150, 200, 300, 600, 900
or more nucleotides. Such sequences can, for example, be amplification
oligonucleotides (e.g., PCR primers), oligonucleotide probes, sequences encoding a
portion or all of a phage-encoded protein, or a fragment or all of a phage-encoded
protein. In preferred embodiments, the nucleic acid sequence contains a sequence
which is within a length range with a lower length as specified above, and an upper
length limit which is no more than 50, 60, 70, 80, or 90% of the length of the
corresponding full-length ORF. The upper length limit can also be expressed in terms
of the number of base pairs of the ORF (coding region). In preferred embodiments,
the nucleic acid sequence is from Staphylococcus aureus phage 77 ORF 17, 19, 43,
102, 104, or 182 as identified in U.S. application 09/407,804, S. aureus phage
44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004,
008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF
002, 008, or 014.

As it is recognized that alternate codons will encode the same amino acid for
most amino acids due to the degeneracy of the genetic code, the sequences of this
aspect include nucleic acid sequences utilizing such alternate codon usage for one or
more codons of a coding sequence. For example, all four nucleic acid sequences
GCT, GCC, GCA, and GCG encode the amino acid alanine. Therefore, if for an
amino acid there exists an average of three codons, a polypeptide of 100 amino acids
in length will, on average, be encoded by 3^100, or about 5 x 10^47, nucleic acid sequences.
Thus, a nucleic acid sequence can be modified (e.g., a nucleic acid sequence from a
phage as specified above) to form a second nucleic acid sequence encoding the same
polypeptide as encoded by the first nucleic acid sequence using routine procedures
and without undue experimentation. Thus, all possible nucleic acid sequences that
encode the specified amino acid sequences are also fully described herein, as if all
were written out in full, taking into account the codon usage, especially that preferred
in the host bacterium. The alternate codon descriptions are available in common
textbooks, for example, Stryer, BIOCHEMISTRY 3rd ed., and Lehninger,
BIOCHEMISTRY 3rd ed., along with many others. Codon preference tables for
various types of organisms are available in the literature. Sequences with alternate
codons at one or more sites can also be utilized in the computer-related aspects and
embodiments herein. Because of the number of sequence variations involving
alternate codon usage, for the sake of brevity, individual sequences are not separately
listed herein. Instead the alternate sequences are described by reference to the natural
sequence with replacement of one or more (up to all, e.g., up to 3, 5, 10, 15, 20, 30, 40,
50, or more) of the degenerate codons with alternate codons from the alternate codon
table (Table 6), or a modified table applicable to a particular organism that has
differing codon usage, preferably with selection according to preferred codon usage
for the normal host organism or a host organism in which a sequence is intended to be
expressed. Those skilled in the art also understand how to alter the alternate codons to
be used for expression in organisms where certain codons code differently than shown
in the "universal" codon table.
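
The degeneracy arithmetic in the preceding paragraph can be checked with a brief sketch. It assumes the standard ("universal") genetic code; Table 6 is not reproduced here, so whether it matches this table is an assumption, and all names below are illustrative.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences for a peptide (stop codon excluded)."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)


print(SYNONYMS["A"])            # the four alanine codons named in the text
print(count_encodings("AL"))    # alanine (4 codons) x leucine (6 codons)
print(f"{3**100:.2e}")          # the 3^100 estimate, about 5 x 10^47
```

The 3^100 figure is an average-case estimate; the exact count for a given polypeptide depends on its composition, as `count_encodings` shows.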

For amino acid sequences or polypeptides, sequences contain at least 5 peptide- 
linked amino acid residues, and preferably at least 6, 7, 10, 15, 20, 30, or 40, amino 
acids having identical amino acid sequence as the same number of contiguous amino 
acid residues in a particular phage ORF product. In some cases longer sequences may 
be preferred, for example, those of at least 50, 60, 70, 80, or 100 amino acids in 
length. In preferred embodiments, the amino acid sequence contains a sequence which 
is within a length range with a lower length as specified above, and an upper length 
limit which is no more than 50, 60, 70, 80, or 90% of the length of the corresponding 
full-length ORF product. The upper length limit can also be expressed in terms of the 
number of amino acid residues of the ORF product. In preferred embodiments, the 
amino acid sequence or polypeptide has bacteria-inhibiting function when expressed 
or otherwise present in a bacterial cell which is a host for the bacteriophage from 
which the sequence was derived. 

By "isolated" in reference to a nucleic acid is meant that a naturally occurring 
sequence has been removed from its normal cellular (e.g., chromosomal) environment 
or is synthesized in a non-natural environment (e.g., artificially synthesized). Thus, 
the sequence may be in a cell-free solution or placed in a different cellular 
environment. The term does not imply that the sequence is the only nucleotide chain 
present, but that it is essentially free (about 90-95% pure at least) of non-nucleotide 
material naturally associated with it, and thus is distinguished from isolated 
chromosomes. 

The term "enriched" means that the specific DNA or RNA sequence 
constitutes a significantly higher fraction (2-5 fold) of the total DNA or RNA present 
in the cells or solution of interest than in normal or diseased cells or in cells from 
which the sequence was originally taken. This could be caused by a person by 
preferential reduction in the amount of other DNA or RNA present, or by a 
preferential increase in the amount of the specific DNA or RNA sequence, or by a 
combination of the two. However, it should be noted that enriched does not imply 
that there are no other DNA or RNA sequences present, just that the relative amount 
of the sequence of interest has been significantly increased. 

The term "significant" is used to indicate that the level of increase is useful to
the person making such an increase and is an increase relative to other nucleic acids of
at least about 2-fold, more preferably at least 5- to 10-fold or even more. The term
also does not imply that there is no DNA or RNA from other sources. The other
source DNA may, for example, comprise DNA from a yeast or bacterial genome, or a
cloning vector such as pUC19. This term distinguishes from naturally occurring
events, such as viral infection, or tumor type growths, in which the level of one 
mRNA may be naturally increased relative to other species of mRNA. That is, the 
term is meant to cover only those situations in which a person has intervened to 
elevate the proportion of the desired nucleic acid. 
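
As a worked illustration of the "enriched" and "significant" definitions, fold enrichment can be computed as the ratio of the sequence's fraction after intervention to its fraction before. This is a minimal sketch; the function names and the example quantities are hypothetical, and the 2-fold default is taken from the lower bound stated above.

```python
def fold_enrichment(target_before: float, total_before: float,
                    target_after: float, total_after: float) -> float:
    """Ratio of (target/total) after intervention to (target/total) before."""
    if min(target_before, total_before, target_after, total_after) <= 0:
        raise ValueError("all quantities must be positive")
    # Algebraically equal to (target_after/total_after) / (target_before/total_before).
    return (target_after * total_before) / (target_before * total_after)


def is_significant(fold: float, threshold: float = 2.0) -> bool:
    """True if the increase meets the chosen fold threshold (2-fold by default)."""
    return fold >= threshold


# Hypothetical amounts in ng: 1 of 1000 before, 10 of 1000 after.
fold = fold_enrichment(1, 1000, 10, 1000)
print(fold, is_significant(fold), is_significant(fold, threshold=5.0))
```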

It is also advantageous for some purposes that a nucleotide sequence be in 
purified form. The term "purified" in reference to nucleic acid does not require 
absolute purity (such as a homogeneous preparation). Instead, it represents an 
indication that the sequence is relatively more pure than in the natural environment 
(compared to the natural level, this level should be at least 2-5 fold greater, e.g., in 
terms of mg/mL). Individual clones isolated from a cDNA library may be purified to 
electrophoretic homogeneity. The claimed DNA molecules obtained from these 
clones could be obtained directly from total DNA or from total RNA. The cDNA 
clones are not naturally occurring, but rather are preferably obtained via manipulation 
of a partially purified naturally occurring substance (messenger RNA). The 
construction of a cDNA library from mRNA involves the creation of a synthetic 
substance (cDNA) and pure individual cDNA clones can be isolated from the 
synthetic library by clonal selection of the cells carrying the cDNA library. Thus, the 
process which includes the construction of a cDNA library from mRNA and isolation 
of distinct cDNA clones yields an approximately 10^6-fold purification of the native
message. Thus, purification of at least one order of magnitude, preferably two or 
three orders, and more preferably four or five orders of magnitude is expressly 
contemplated. 

The terms "isolated", "enriched", and "purified" with respect to nucleic acids,
above, may similarly be used to denote the relative purity and abundance of
polypeptides (multimers of amino acids joined one to another by α-carboxyl:α-amino
group (peptide) bonds). These, too, may be stored in, grown in, screened in, and
selected from libraries using biochemical techniques familiar in the art. Such 
polypeptides may be natural, synthetic or chimeric and may be extracted using any of 
a variety of methods, such as antibody immunoprecipitation, other "tagging" 
techniques, conventional chromatography and/or electrophoretic methods. Some of 
the above utilize the corresponding nucleic acid sequence. 

As indicated above, aspects and embodiments of the invention are not limited
to entire genes and proteins. The invention also provides and utilizes fragments and
portions thereof, preferably those which are "active" in the inhibitory sense described
above. Such peptides or oligopeptides and oligo- or polynucleotides have preferred
lengths as specified above for nucleic acid and amino acid sequences from phage;
corresponding recombinant constructs can be made to express the same.
Also included are homologous sequences and fragments thereof.

Nucleic acid sequences of the present invention can be isolated using a method
similar to those described herein or other methods known to those skilled in the art.
In addition, such nucleic acid sequences can be chemically synthesized by well-
known methods. Also, by having particular phage ORFs, e.g., the phage ORFs
identified herein (e.g., anti-bacterial ORFs of the present invention, portions thereof,
or oligonucleotides derived therefrom as described), other antimicrobial sequences
from other bacteriophage sources can be identified and isolated using methods
described here or other methods, including methods utilizing nucleic acid
hybridization and/or computer-based sequence alignment methods.

The invention also provides bacteriophage antimicrobial DNA segments from
other phages based on nucleic acids and sequences hybridizing to the presently
identified inhibitory ORF under high stringency conditions or sequences that are
highly homologous. The bacteriophage segment from a specific phage, e.g., an
antimicrobial DNA segment, can be used to identify a related segment from another
unrelated phage based on stringent conditions of hybridization or on being a homolog
based on nucleic acid and/or amino acid sequence comparisons. As with identified
inhibitory sequences, such homologous coding sequences and products can be used as
antimicrobials, to construct active portions or derivatives, to construct
peptidomimetics, and to identify bacterial targets.

The nucleotide and amino acid sequences identified herein are believed to be
correct; however, certain sequences may contain a small percentage of errors, e.g.,
1-5%. In the event that any of the sequences have errors, the corrected sequences can be
readily provided by one skilled in the art using routine methods. For example, the
nucleotide sequences can be confirmed or corrected by obtaining and culturing the
relevant phage, and purifying phage genomic nucleic acids. A region or regions of
interest can be amplified, e.g., by PCR from the appropriate genomic template, using
primers based on the described sequence. The amplified regions can then be
sequenced using any of the available methods (e.g., a dideoxy termination method).
This can be done redundantly to provide the corrected sequence or to confirm that the
described sequence is correct. Alternatively, a particular sequence or sequences can
be identified and isolated as an insert or inserts in a phage genomic library and
isolated, amplified, and sequenced by standard methods. Confirmation or correction
of a nucleotide sequence for a phage gene provides an amino acid sequence of the
encoded product by merely reading off the amino acid sequence according to the
normal codon relationships; alternatively or in addition, the sequence can be expressed
in a standard expression system and the polypeptide product sequenced by standard
techniques. The sequences described
herein thus provide unique identification of the corresponding genes, coding
sequences, and other sequences, allowing those sequences to be used in the various
aspects of the present invention.
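
Reading off an amino acid sequence "according to the normal codon relationships" can be sketched as follows. The sketch assumes the standard genetic code and an in-frame coding sequence; the function name and the example sequence are hypothetical and are not phage sequences from this specification.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate an in-frame DNA or RNA coding sequence up to the first stop codon."""
    seq = coding_sequence.upper().replace("U", "T")
    residues = []
    # Step through complete codons; a trailing partial codon is ignored.
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TO_AA[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# Hypothetical sequence: ATG, the four alanine codons, then a stop codon.
print(translate("ATGGCTGCCGCAGCGTAA"))
```

An organism with differing codon usage would need a modified table in place of `AMINO_ACIDS`.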

In other aspects, the invention provides recombinant vectors and cells
harboring at least one of the phage ORFs or portion thereof, or bacterial target
sequences described herein. As understood by those skilled in the art, vectors may be
provided in different forms, including, for example, plasmids, cosmids, and
virus-based vectors. See, e.g., Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory
Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; see also Ausubel,
F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John Wiley & Sons,
Secaucus, N.J.

In preferred embodiments, the vectors will be expression vectors, preferably
shuttle vectors that permit cloning, replication, and expression within bacteria. An
"expression vector" is one having regulatory nucleotide sequences containing
transcriptional and translational regulatory information that controls expression of the
nucleotide sequence in a host cell. Preferably the vector is constructed to allow
amplification from vector sequences flanking an insert locus. In certain embodiments,
the expression vectors may additionally or alternatively support expression and/or
replication in animal, plant and/or yeast cells due to the presence of suitable
regulatory sequences, e.g., promoters, enhancers, 3' stabilizing sequences, primer
sequences, etc. In preferred embodiments, the promoters are inducible and specific
for the system in which expression is desired, e.g., bacteria, animal, plant, or yeast.
The vectors may optionally encode a "tag" sequence or sequences to facilitate protein
purification. Convenient restriction enzyme cloning sites and suitable selective
marker(s) are also optionally included. Such selective markers can be, for example,
antibiotic resistance markers or markers which supply an essential nutritive growth
factor to an otherwise deficient mutant host, e.g., tryptophan, histidine, or leucine in
the Yeast Two-Hybrid systems described below.

The term "recombinant vector" relates to a single- or double-stranded circular
nucleic acid molecule that can be transfected into cells and replicated within or 
independently of a cell genome. A circular double-stranded nucleic acid molecule can 
be cut and thereby linearized upon treatment with appropriate restriction enzymes. An 
assortment of nucleic acid vectors, restriction enzymes, and the knowledge of the 
nucleotide sequences cut by restriction enzymes are readily available to those skilled 
in the art. A nucleic acid molecule encoding a desired product can be inserted into a 
vector by cutting the vector with restriction enzymes and ligating the two pieces 
together. Preferably the vector is an expression vector, e.g., a shuttle expression 
vector as described above. 

By "recombinant cell" is meant a cell possessing introduced or engineered
nucleic acid sequences, e.g., as described above. The sequence may be in the form of 
or part of a vector or may be integrated into the host cell genome. Preferably the cell 
is a bacterial cell. 

In another aspect, the invention also provides methods for identifying and/or 
screening compounds "active on" at least one bacterial target of a bacteriophage 
inhibitor protein or RNA. Preferred embodiments involve contacting such a bacterial 
target or targets (e.g., bacterial target proteins) with a test compound, and determining 
whether the compound binds to or reduces the level of activity of the bacterial target 
(e.g., a bacterial target protein). Preferably this is done either in vivo (i.e., in a cell- 
based assay) or in vitro, e.g., in a cell-free system under approximately physiological 
conditions. 

The compounds that can be used may be large or small, synthetic or natural, 
organic or inorganic, proteinaceous or non-proteinaceous. In preferred embodiments, 
the compound is a peptidomimetic, as described herein, a bacteriophage inhibitor 
protein or fragment or derivative thereof, preferably an "active portion", or a small 
molecule. 

In preferred embodiments, the bacterial target is a target of a phage ORF 
identified herein, e.g., S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus 
pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, 
or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014. 

In particular embodiments, the methods include the identification of bacterial 
targets or the site of action of an inhibitor on a bacterial target as described above or 
otherwise described herein. 

In embodiments involving binding assays, preferably binding is to a fragment 
or portion of a bacterial target protein, where the fragment includes less than 90%, 
80%, 70%, 60%, 50%, 40%, or 30% of an intact bacterial target protein. Preferably, 
the at least one bacterial target includes a plurality of different targets of
bacteriophage inhibitor proteins, preferably a plurality of different targets. The 
plurality of targets can be in or from a plurality of different bacteria, but preferably is 
from a single bacterial species. 

A "method of screening" refers to a method for evaluating a relevant activity 
or property of a large plurality of compounds (e.g., a bacteria-inhibiting activity), 
rather than just one or a few compounds. For example, a method of screening can be 
used to conveniently test at least 100, more preferably at least 1000, still more 
preferably at least 10,000, and most preferably at least 100,000 different compounds, 
or even more. 

In the context of this invention, the term "small molecule" refers to 
compounds having molecular mass of less than 2000 Daltons, preferably less than 
1500, still more preferably less than 1000, and most preferably less than 600 Daltons. 
Preferably but not necessarily, a small molecule is not an oligopeptide. 
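
The nested mass limits in the "small molecule" definition can be expressed as a simple lookup. This is a sketch under stated assumptions: the tier labels and function name are illustrative, and the example mass (about 822.9 Da for rifampin, an agent mentioned elsewhere in this description) is an approximate value supplied here, not one given in the specification.

```python
# Upper mass limits (Daltons) from the definition above, tightest first.
MASS_TIERS = [
    (600.0, "most preferred"),
    (1000.0, "still more preferred"),
    (1500.0, "preferred"),
    (2000.0, "small molecule"),
]


def small_molecule_tier(mass_da: float):
    """Return the tightest tier whose limit the mass falls below, or None."""
    if mass_da <= 0:
        raise ValueError("mass must be positive")
    for limit, label in MASS_TIERS:
        if mass_da < limit:
            return label
    return None  # 2000 Da or more: outside the definition


print(small_molecule_tier(822.9))   # approximate mass of rifampin (assumed value)
print(small_molecule_tier(2500.0))
```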

In a related aspect or in preferred embodiments, the invention provides a 
method of screening for potential antibacterial agents by determining whether any of a 
plurality of compounds, preferably a plurality of small molecules, is active on at least 
one target of a bacteriophage inhibitor protein or RNA. Preferred embodiments 
include those described for the above aspect, including embodiments which involve
determining whether one or more test compounds bind to or reduce the level of 
activity of a bacterial target, and embodiments which utilize a plurality of different 
targets as described above. 

The identification of bacteria-inhibiting phage ORFs and their encoded 
products also provides a method for identifying an active portion of such an encoded 
product. This also provides a method for identifying a potential antibacterial agent by 
identifying such an active portion of a phage ORF or ORF product. In preferred 
embodiments, the identification of an active portion involves one or more of 
mutational analysis, deletion analysis, or analysis of fragments of such products. The 
method can also include determination of a 3-dimensional structure of an active 
portion, such as by analysis of crystal diffraction patterns. In further embodiments, 
the method involves constructing or synthesizing a peptidomimetic compound, where 
the structure of the peptidomimetic compound corresponds to the structure of the 
active portion. In this context, "corresponds" means that the peptidomimetic 
compound structure has sufficient similarities to the structure of the active portion that 
the peptidomimetic will interact with the same molecule as the phage protein and 
preferably will elicit at least one cellular response in common which relates to the 
inhibition of the cell by the phage protein. 

In preferred embodiments, the ORF or ORF product is or is derived or
obtained from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae
phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or 
Enterococcus sp. phage 182 ORF 002, 008, or 014 or product thereof. 

The methods for identifying or screening for compounds or agents active on a 
bacterial target of a phage-encoded inhibitor can also involve identification of a 
phage-specific site of action on the target. 

Preferably in the methods for identifying or screening for compounds active 
on such a bacterial target, the target is uncharacterized; the target is from an 
uncharacterized bacterium from Table 1; the site of action is a phage-specific site of
action. 

Further embodiments include the identification of inhibitor phage ORFs and 
bacterial targets as in aspects above. 

An "active portion" as used herein denotes an epitope, a catalytic or regulatory 
domain, or a fragment of a bacteriophage inhibitor protein that is responsible for, or a 
significant factor in, bacterial target inhibition. The active portion preferably may be 
removed from its contiguous sequences and, in isolation, still effect inhibition. 

By "mimetic" is meant a compound structurally and functionally related to a 
reference compound that can be natural, synthetic, or chimeric. In terms of the present 
invention, a "peptidomimetic," for example, is a compound that mimics the activity-
related aspects of the 3-dimensional structure of a peptide or polypeptide in a non-
peptide compound, for example mimics the structure of a peptide or active portion of
a phage- or bacterial ORF-encoded polypeptide. 

A related aspect provides a method for inhibiting a bacterial cell by contacting 
the bacterial cell with a compound active on a bacterial target of a bacteriophage 
inhibitor protein or RNA, where the target was uncharacterized. In preferred 
embodiments, the compound is such a protein, or a fragment or derivative thereof; a 
structural mimetic, e.g., a peptidomimetic, of such a protein or fragment; a small 
molecule; the contacting is performed in vitro; the contacting is performed in vivo in
an infected or at risk organism, e.g., an animal such as a mammal or bird, for example, 
a human, or other mammal described herein; the bacterium is selected from a genus 
and/or species listed in Table 1; the bacteriophage inhibitor protein is uncharacterized; 
the bacteriophage inhibitor protein is from an uncharacterized phage listed in Table 1; 
the phage inhibitor protein is from one of S. aureus phage 44AHJD ORF 1, 9, or 12, 
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014. 



In the context of targets in this invention, the term "uncharacterized" means 
that the target was not recognized as an appropriate target for an antibacterial agent 
prior to the filing of the present application or alternatively prior to the present 
invention. Such lack of recognition can include, for example, situations where the 
target and/or a nucleotide sequence encoding the target were unknown, situations 
where the target was known, but where it had not been identified as an appropriate 
target or as an essential cellular component, and situations where the target was 
known as essential but had not been recognized as an appropriate target due to a belief 
that the target would be inaccessible or otherwise that contacting the cell with a 
compound active on the target in vitro would be ineffective in cellular inhibition, or 
ineffective in treatment of an infection. Methods described herein utilizing bacterial 
targets, e.g., for inhibiting bacteria or treating bacterial infections, can also utilize 
"uncharacterized target sites", meaning that the target has been previously recognized 
as an appropriate target for an antibacterial agent, but where an agent or inhibitor of 
the invention is used which acts at a different site than that at which the previously 
utilized antibacterial agent acts, i.e., a phage-specific site. Preferably the 
phage-specific site has different functional characteristics from the previously 
utilized site. In the context of targets or target sites, the term "phage-specific" 
indicates that the target or site is utilized by at least one bacteriophage as an 
inhibitory target and is different from previously identified targets or target sites. 

In the context of this invention, the term "bacteriophage inhibitor protein" 
refers to a protein encoded by a bacteriophage nucleic acid sequence which inhibits 
bacterial function in a host bacterium. Thus, it is a bacteria-inhibiting phage product. 
In the context of this invention, the phrase "contacting the bacterial cell with a 
compound active on a bacterial target of a bacteriophage inhibitor protein" or 
an equivalent phrase refers to contacting with an isolated, purified, or enriched 
compound or a composition including such a compound, but specifically does not rely 
on contacting the bacterial cell with an intact phage which encodes the compound. 
Preferably no intact phage are involved in the contacting. 

Related aspects provide methods for prophylactic or therapeutic treatment of a 
bacterial infection by administering to an infected, challenged or at risk organism a 
therapeutically or prophylactically effective amount of a compound active on a target 
of a bacteriophage inhibitor protein or RNA, or as described for the previous aspect. 
Preferably the bacterium involved in the infection or risk of infection produces the 
identified target of the bacteriophage inhibitor protein or alternatively produces a 
homologous target compound. In preferred embodiments, the host organism is a plant 
or animal, preferably a mammal or bird, and more preferably, a human or other 
mammal described herein. Preferred embodiments include, without limitation, those 
as described for the preceding aspect. 

Compounds useful for the methods of inhibiting, methods of treating, and 
pharmaceutical compositions can include novel compounds, but can also include 
compounds which had previously been identified for a purpose other than inhibition 
of bacteria. Such compounds can be utilized as described and can be included in 
pharmaceutical compositions. 

In preferred embodiments of this and other aspects of the invention utilizing 
bacterial target sequences of a bacteriophage inhibitory ORF product, the target 
sequence is encoded by a Staphylococcus nucleic acid coding sequence, preferably S. 
aureus, a Streptococcus nucleic acid coding sequence, preferably Streptococcus 
pneumoniae, or an Enterococcus nucleic acid coding sequence. Possible target 
sequences are described herein by reference to sequence source sites. 

The amino acid sequence of a polypeptide target is readily provided by 
translating the corresponding coding region. For the sake of brevity, the sequences 
are described by reference to the GenBank entries instead of being written out in 
full herein. In cases where the TIGR or GenBank entry for a coding region is not 
complete, the complete sequence can be readily obtained by routine methods, e.g., by 
isolating a clone in a phage host genomic library, and sequencing the clone insert to 
provide the relevant coding region. The boundaries of the coding region can be 
identified by conventional sequence analysis and/or by expression in a bacterium in 
which the endogenous copy of the coding region has been inactivated and using 
subcloning to identify the functional start and stop codons for the coding region. 
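
By way of illustration only, translating a coding region into its amino acid 
sequence can be sketched as follows. The sketch assumes the standard genetic code, 
a coding region supplied in frame beginning at its start codon, and translation that 
ends at the first stop codon. The function name translate and the example sequence 
are hypothetical and are not taken from any sequence referenced herein. 

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# for each codon position ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_region):
    """Translate an in-frame DNA or RNA coding region up to the first stop codon.

    Codons containing unrecognized characters are reported as "X".
    """
    seq = coding_region.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example: ATG AAA TTT GGC TAA
example_protein = translate("ATGAAATTTGGCTAA")
```

An incomplete database entry would yield a truncated product under this sketch, 
which is one reason the complete coding region is obtained first as described above. 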

In the context of nucleic acid or amino acid sequences of this invention, the 
term "corresponding" indicates that the sequence is at least 95% identical, preferably 
at least 97% identical, and more preferably at least 99% identical to a sequence from 
the specified phage genome, a ribonucleotide equivalent, a degenerate equivalent 
(utilizing one or more degenerate codons), or a homologous sequence, where the 
homolog provides an equivalent biological function. 
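
As a hedged illustration of the identity thresholds recited above, percent identity 
can be computed over a pair of sequences that have already been aligned. The sketch 
assumes equal-length aligned strings with "-" marking gaps and takes the alignment 
length as the denominator; the text above does not fix a particular identity 
calculation, so that choice, like the function names, is an assumption. 

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes the two strings are already aligned to equal length, with "-"
    for gaps. Gap columns count toward the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_corresponding(aligned_a, aligned_b, threshold=95.0):
    """True if the pair meets the identity threshold (95, 97, or 99 above)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 20-residue sequences differing at a single position are 
95% identical and meet the broadest threshold but not the 97% or 99% thresholds. 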

By "treatment" or "treating" is meant administering a compound or 
pharmaceutical composition for prophylactic and/or therapeutic purposes. The term 
"prophylactic treatment" refers to treating a patient or animal that is not yet infected 
but is susceptible to or otherwise at risk of a bacterial infection. The term "therapeutic 
treatment" refers to administering treatment to a patient already suffering from 
infection. 

The term "bacterial infection" refers to the invasion of the host organism, 
animal or plant, by pathogenic bacteria. This includes the excessive growth of bacteria 
which are normally present in or on the body of the organism, but more generally, a 
bacterial infection can be any situation in which the presence of a bacterial 
population(s) is damaging to a host organism. Thus, for example, an organism suffers 
from a bacterial infection when excessive numbers of a bacterial population are 
present in or on the organism's body, or when the effects of the presence of a bacterial 
population(s) are damaging to the cells, tissue, or organs of the organism. 

The terms "administer", "administering", and "administration" refer to a 
method of giving a dosage of a compound or composition, e.g., an antibacterial 
pharmaceutical composition, to an organism. Where the organism is a mammal, the 
method is, e.g., topical, oral, intravenous, transdermal, intraperitoneal, intramuscular, 
or intrathecal. The preferred method of administration can vary depending on various 
factors, e.g., the components of the pharmaceutical composition, the site of the 
potential or actual bacterial infection, the bacterium involved, and the infection 
severity. 

The term "mammal" has its usual biological meaning referring to any 
organism of the Class Mammalia of higher vertebrates that nourish their young with 
milk secreted by mammary glands, e.g., mouse, rat, and, in particular, human, bovine, 
sheep, swine, dog, and cat. 

In the context of treating a bacterial infection a "therapeutically effective 
amount" or "pharmaceutically effective amount" indicates an amount of an 
antibacterial agent, e.g., as disclosed for this invention, which has a therapeutic effect. 
This generally refers to the inhibition, to some extent, of the normal cellular 
functioning of bacterial cells that causes or contributes to bacterial infection. 
The dose of antibacterial agent that is useful as a treatment is a 
"therapeutically effective amount." Thus, as used herein, a therapeutically effective 
amount means an amount of an antibacterial agent that produces the desired 
therapeutic effect as judged by clinical trial results and/or animal models. This amount 
can be routinely determined by one skilled in the art and will vary depending on 
several factors, such as the particular bacterial strain involved and the particular 
antibacterial agent used. 

In connection with claims to methods of inhibiting bacteria and therapeutic or 
prophylactic treatments, "a compound active on a target of a bacteriophage inhibitor 
protein" or terms of equivalent meaning differ from administration of or contact with 
an intact phage naturally encoding the full-length inhibitor compound. While an 
intact phage may conceivably be incorporated in the present methods, the method at 
least includes the use of an active compound as specified different from a full-length 
inhibitor protein naturally encoded by a bacteriophage and/or a delivery or contacting 
method different from administration of or contact with an intact phage encoding the 
full-length protein. Similarly, pharmaceutical compositions described herein at least 
include an active compound different from a full-length inhibitor protein naturally 
encoded by a bacteriophage or such a full-length protein is provided in the 
composition in a form different from being encoded by an intact phage. Preferably 
the methods and compositions do not include an intact phage. 

In accord with the above aspects, the invention also provides antibacterial 
agents and compounds active on bacterial targets of bacteriophage inhibitor proteins 
or RNAs, where the target was uncharacterized as indicated above. As previously 
indicated, such active compounds include both novel compounds and compounds 
which had previously been identified for a purpose other than inhibition of bacteria. 
Such previously identified biologically active compounds can be used in 
embodiments of the above methods of inhibiting and treating. In preferred 
embodiments, the targets, bacteriophage, and active compound are as described herein 
for methods of inhibiting and methods of treating. Preferably the agent or compound 
is formulated in a pharmaceutical composition which includes a pharmaceutically 
acceptable carrier, excipient, or diluent. In addition, the invention provides agents, 
compounds, and pharmaceutical compositions where an active compound is active on 
an uncharacterized phage-specific site. 

In preferred embodiments, the target is as described for embodiments of 
aspects above. 

Likewise, the invention provides a method of making an antibacterial agent. 
The method involves identifying a target of a bacteriophage inhibitor polypeptide or 
protein or RNA, screening a plurality of compounds to identify a compound active on 
the target, and synthesizing the compound in an amount sufficient to provide a 
therapeutic effect when administered to an organism infected by a bacterium naturally 
producing the target. In preferred embodiments, the identification of the target and 
identification of active compounds include steps or methods and/or components as 
described above (or otherwise herein) for such identification. Likewise, the active 
compound can be as described above, including fragments and derivatives of phage 
inhibitor proteins, peptidomimetics, and small molecules. As recognized by those 
skilled in the art, peptides can be synthesized by expression systems and purified, or 
can be synthesized artificially. In preferred embodiments the inhibitory phage ORF 
product is from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus 
pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, 
or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014. 

As indicated above, sequence analysis of nucleotide and/or amino acid 
sequences can beneficially utilize computer analysis. Thus, in additional aspects the 
invention provides computer-related hardware and media and methods utilizing and 
incorporating sequence data from uncharacterized phage, e.g., uncharacterized phage 
listed in Table 1, preferably at least one of S. aureus phage 44AHJD ORF 1, 9, or 12, 
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014, or of 
Staphylococcus aureus phage 44AHJD, Enterococcus sp. phage 182, or 
Streptococcus pneumoniae phage Dp-1. In general, such aspects can facilitate the 
above-described aspects. Various embodiments involve the analysis of genetic 
sequence and encoded products, as applied to evaluating bacteriophage inhibitor 
ORFs and compounds and fragments related thereto. The various sequence analyses, 
as well as function analyses, can be used separately or in combination, as well as in 
preceding aspects and embodiments. Use in combination is often advantageous as the 
additional information allows more efficient prioritizing of phage ORFs for 
identification of those ORFs that provide bacteria-inhibiting function. 

In one aspect, the invention provides a computer-readable device which 
includes at least one recorded amino acid or nucleotide sequence corresponding to one 
of the specified phage and a sequence analysis program for analyzing a nucleotide 
and/or amino acid sequence. The device is arranged such that the sequence 
information can be retrieved and analyzed using the analysis program. The analysis 
can identify, for example, homologous sequences or the indicated percentages of the 
phage genome and structural motifs. Preferably the sequence includes at least 1 phage 
ORF or encoded product, more preferably at least 10%, 20%, 30%, 40%, 50%, 70%, 
90%, or 100% of the genomic phage ORFs and/or equivalent cDNA, RNA, or amino 
acid sequences. Preferably the sequence or sequences in the device are recorded in a 
medium such as a floppy disk, a computer hard drive, an optical disk, computer 
random access memory (RAM), or magnetic tape. The program may also be recorded 
in such medium. The sequences can also include sequences from a plurality of 
different phage. 

In this context, the term "corresponding" indicates that the sequence is at least 
95% identical, preferably at least 97% identical, and more preferably at least 99% 
identical to a sequence from the specified phage genome, a ribonucleotide equivalent, 
a degenerate equivalent (utilizing one or more degenerate codons), or a homologous 
sequence, where the homolog provides an equivalent biological function. 

Similarly, the invention provides a computer analysis system for identifying 
biologically important portions of a bacteriophage genome. The system includes a 
data storage medium, e.g., as identified above, which has recorded thereon a 
nucleotide sequence corresponding to at least a portion of at least one uncharacterized 
bacteriophage genome, a set of program instructions to allow searching of the 
sequence or sequences to analyze the sequence, and an output device, where the 
portion includes at least the sequence length as specified in the preceding aspect. The 
output device is preferably a printer, a video display, or a recording medium. More 
than one output device may be included. For each of the present computer-related 
aspects, the bacteriophage are preferably selected from the uncharacterized phage 
listed in Table 1, more preferably from bacteriophage 77, 3A, 96, 44AHJD (S. 
aureus), Dp-1 (Streptococcus pneumoniae), or 182 (Enterococcus). 

In keeping with the computer device aspects, the invention also provides a 
method for identifying or characterizing a bacteriophage ORF by providing a 
computer-based system for analyzing nucleotide or amino acid sequences, e.g., as 
described above. The system includes a data storage medium which has recorded a 
sequence or sequences as described for the above devices, a set of instructions as in 
the preceding aspect, and an output device as in the preceding aspect. The method 
further involves analyzing at least one sequence, and outputting the analysis results to 
at least one output device. 
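
The three elements of such a system (recorded sequences, a set of analysis 
instructions, and an output device) can be sketched minimally as below. This is an 
illustrative assumption, not the system of the invention: the in-memory dictionary 
stands in for the data storage medium, a regular-expression motif search stands in 
for the analysis program, and a print callback stands in for the output device. The 
record names and sequences are hypothetical. 

```python
import re

# Hypothetical stand-in for sequences recorded on the data storage medium.
SEQUENCE_STORE = {
    "example_orf_A": "ATGGCTGGTAAAGGTTCTTAA",
    "example_orf_B": "ATGCCCTTTTAA",
}


def find_motif(sequence, pattern):
    """Return 0-based start positions of every match, including overlaps.

    The pattern is a regular expression; the zero-width lookahead lets
    matches that overlap each other all be reported.
    """
    lookahead = f"(?=(?:{pattern}))"
    return [m.start() for m in re.finditer(lookahead, sequence.upper())]


def analyze(store, pattern):
    """Run the motif search over every recorded sequence."""
    return {name: find_motif(seq, pattern) for name, seq in store.items()}


def output(results, write=print):
    """Send one tab-separated line per sequence to the output device."""
    for name in sorted(results):
        positions = ",".join(str(p) for p in results[name]) or "none"
        write(f"{name}\t{positions}")


# Usage: analyze the stored sequences for a motif and output the results.
output(analyze(SEQUENCE_STORE, "GGT"))
```

Passing a different write callable (for example, a file object's write method) 
models a different output device without changing the analysis step. 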

In preferred embodiments, the analysis identifies a sequence similarity or 
homology with a sequence or sequences selected from bacterial ORFs encoding 
products with related biological function; ORFs encoding known inhibitors; and 
essential bacterial ORFs. Preferably the analysis identifies a probable biological 
function based on identification of structural elements or characteristic or signature 
motifs of an encoded product or on sequence similarity or homology. Preferably the 
uncharacterized bacteriophage is from Table 1, more preferably at least one of 
bacteriophage 77, 3A, 96, 44AHJD (S. aureus), Dp-1 (Streptococcus pneumoniae), or 
182 (Enterococcus). In preferred embodiments, the method also involves determining 
at least a portion of the nucleotide sequence of at least one uncharacterized 
bacteriophage as indicated, and recording that sequence on the data storage medium of the 
computer-based system. In preferred embodiments, the analysis identifies a sequence 
similarity or homology with a S. aureus phage 44AHJD ORF 1, 9, or 12, 
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014. 

As used in the claims to describe the various inventive aspects and 
embodiments, "comprising" means including, but not limited to, whatever follows the 
word "comprising". Thus, use of the term "comprising" indicates that the listed 
elements are required or mandatory, but that other elements are optional and may or 
may not be present. By "consisting of" is meant including, and limited to, whatever 
follows the phrase "consisting of". Thus, the phrase "consisting of" indicates that the 
listed elements are required or mandatory, and that no other elements may be present. 
By "consisting essentially of" is meant including any elements listed after the phrase, 
and limited to other elements that do not interfere with or contribute to the activity or 
action specified in the disclosure for the listed elements. Thus, the phrase "consisting 
essentially of" indicates that the listed elements are required or mandatory, but that 
other elements are optional and may or may not be present depending upon whether or 
not they affect the activity or action of the listed elements. 

Further embodiments will be apparent from the following Detailed Description 
and from the claims. 



BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURES 1A and 1B are flow schematics showing the manipulations used to 
convert pT0021, an arsenite inducible vector containing the luciferase gene, into 
pTHA or pTM, two ars inducible vectors. Vector pTHA contains BamH I, Sal I, and 
Hind III cloning sites and a downstream HA epitope tag. Vector pTM contains BamH I 
and Hind III cloning sites and no HA epitope tag. 


FIGURE 2 is a schematic representation of the cloning steps involved to place 
the DNA segments of any of ORFs 17/19/43/102/104/182 or other sequences into 
pTHA to assess inhibitory potential. For subcloning into pTM or pT0021, individual 
ORFs were amplified by PCR using oligonucleotides targeting the ATG and stop 
codons of the ORFs. Using this strategy, BamH I and Hind III sites were positioned 
immediately upstream or downstream, respectively, of the start and stop codons of 
each ORF. Following digestion with BamH I and Hind III, the PCR fragments were 
subcloned into the same sites of pT0021 or pTM. Clones were verified by PCR and 
direct sequencing. 

FIGURE 3 shows a schematic representation of the functional assays used to 
characterize the bactericidal and bacteriostatic potential of all predicted ORFs (>33 
amino acids) encoded by bacteriophage 77. Fig. 3A) Functional assay on semi-solid 
support media. Fig. 3B) Functional assay in liquid culture. 

FIGURE 4A, B, and C is a bar graph showing the results of a screen in liquid 
media to assess bacteriostatic or bactericidal activity of 93 predicted ORFs (>33 
amino acids) encoded by bacteriophage 77. Growth inhibition assays were performed 
as detailed in the Detailed Description. The relative growth of Staphylococcus aureus 
transformants harboring a given bacteriophage 77 ORF (identified on the bottom of 
the graph), in the absence or presence of arsenite, is plotted relative to growth of a 
Staphylococcus aureus transformant containing ORF 5, a non-toxic bacteriophage 77 
ORF (which is set at 100%). Each bar represents the average obtained from three 
Staphylococcus aureus transformants grown in duplicate. Bacteriophage 77 ORFs 
showing significant growth inhibition consist of ORFs 17, 19, 102, 104, and 182. 
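
The normalization described for FIGURE 4 can be illustrated with a short sketch, 
offered only as an assumption about how such values might be computed. Readings for 
a test ORF (three transformants in duplicate, hence six values) are averaged and 
expressed as a percentage of the averaged ORF 5 control, which is thereby set at 
100%. The readings shown are invented placeholders, not data from the figure, and 
no cutoff for "significant" inhibition is implied. 

```python
from statistics import mean


def relative_growth(orf_readings, control_readings):
    """Mean growth of a test ORF as a percentage of the mean control growth.

    Each argument is a list of growth readings (for example, optical
    densities) from replicate cultures; the control average defines 100%.
    """
    if not orf_readings or not control_readings:
        raise ValueError("both reading lists must be non-empty")
    control_mean = mean(control_readings)
    if control_mean <= 0:
        raise ValueError("control growth must be positive")
    return 100.0 * mean(orf_readings) / control_mean


# Placeholder readings: three transformants, each grown in duplicate.
control_orf5 = [0.50, 0.50, 0.50, 0.50, 0.50, 0.50]
test_orf = [0.25, 0.25, 0.25, 0.25, 0.25, 0.25]
percent_of_control = relative_growth(test_orf, control_orf5)
```

The same calculation would be applied separately to the cultures grown in the 
absence and in the presence of arsenite. 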

FIGURE 5 shows a block diagram of major components of a general purpose 
computer. 


FIGURE 6 shows an ORF map for Streptococcus pneumoniae bacteriophage 
Dp-1 showing the ORF identifiers, genomic locations, and orientations of the 85 
identified ORFs that were found to have ribosomal binding sites and thus are expected 
to be expressed. 


FIGURE 7 shows a schematic representation of the arsenite-inducible 
expression system present in a shuttle vector designed to express individual 
Streptococcus bacteriophage Dp-1 ORFs in Streptococcus. Various modifications can 
be readily made to such a vector, or other vectors can be readily constructed to 
provide inducible expression of ORFs in a particular host bacterium using well-known 
techniques. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The invention may be more clearly understood from the following description. 
The tables will first be briefly described. 

Table 1 is a listing of a large number of available bacteriophage that can be 
readily obtained and used in the present invention. 

Table 2 shows the complete nucleotide sequence of the genome of 
Staphylococcus aureus bacteriophage 77. 

Table 3 shows a list of all the ORFs from bacteriophage 77 that were screened 
in the functional assay to identify those with anti-microbial activity. 

Table 4 shows the predicted nucleotide sequence, predicted amino acid 
sequence, and physicochemical parameters of ORFs 17/19/43/102/104/182. These 
include the primary amino acid sequence of the predicted protein, the average 
molecular weight, amino acid composition, theoretical pI, hydrophobicity map, and 
predicted secondary structure map. 

Table 5 shows homology search results. BLAST analysis was performed with 
ORFs 17/19/43/102/104/182 against NCBI non-redundant nucleotide and 
Swissprot databases. The results of this search indicate that: I) ORF 17 has no 
significant homology to any gene in the NCBI non-redundant nucleotide 
database, II) ORF 19 has significant homology to one gene in the NCBI 
non-redundant nucleotide database - the gene encoding ORF 59 of bacteriophage phi PVL, 
III) ORF 43 has significant homology to one gene in the NCBI non-redundant 
nucleotide database - the gene encoding ORF 39 of phi PVL, IV) ORF 102 has 
significant homology to one gene in the NCBI non-redundant nucleotide database - 
the gene encoding ORF 38 of phi PVL, V) ORF 104 has no significant homology to 
any gene in the NCBI non-redundant nucleotide database, VI) ORF 182 has 
significant homology to one gene in the NCBI non-redundant nucleotide database - 
the gene encoding ORF 39 of phi PVL. 

Table 6 is a table from Alberts et al., MOLECULAR BIOLOGY OF THE 
CELL 3rd ed., showing the redundancy of the "universal" genetic code. 

Table 7 shows the complete nucleotide sequence of Staphylococcus aureus 
bacteriophage 3A. 

Table 8 is a listing of the ORFs identified in Staphylococcus aureus 
bacteriophage 3A. 

Table 9 shows the complete nucleotide sequence of Staphylococcus aureus 
bacteriophage 96. 

Table 10 is a listing of the ORFs identified in Staphylococcus aureus 
bacteriophage 96. 

Table 11 is a listing of sequences deposited in the NCBI public database 
(GenBank) for bacteriophage listed in Table 1. 

Table 12 is a listing of phage which encode a known lysis function, including 
the identified lysis gene. 

Table 13 is a listing of bacteriophage which encode holin genes, where holin 
genes encode proteins which form pores and eventually enable other enzymes to kill 
the host bacterium. 

Table 14 is a listing of bacteriophage which encode kil genes. 

Table 15 is a list of Staphylococcus aureus sequences identified by accession 
number which may include sequences from genes coding for target sequences for the 
phage 77-encoded antimicrobial proteins or peptides. The sequences were obtained 
by searching GenBank for listings. 

Table 16 shows the nucleotide sequence of the genome of Staphylococcus 
aureus phage 44AHJD. 

Table 17 lists and shows the sequence position of the 73 ORFs predicted to be 
encoded by Staphylococcus aureus bacteriophage 44AHJD that are greater than 33 
amino acids. 

Table 18 shows the ORF sequences and putative amino acid sequences for the 
Staphylococcus aureus bacteriophage 44AHJD ORFs greater than 33 amino acids. 

Table 19 shows the similarities in sequence identified between predicted 
Staphylococcus aureus bacteriophage 44AHJD ORFs and sequences present in public 
databases. 

Table 20 shows the homology alignments between predicted Staphylococcus 
aureus bacteriophage 44AHJD ORFs and the corresponding protein sequences present 
in public sequence databases. 

Table 21 shows the complete nucleotide sequence of the genome of 
Enterococcus bacteriophage 182. 

Table 22 lists and shows the sequence position of the 80 ORFs identified in 
bacteriophage 182 that are greater than 33 amino acids. 

Table 23 shows the nucleotide and predicted amino acid sequence of all 80 
ORFs identified in bacteriophage 182. 

Table 24 shows the similarities identified to date in sequence between 
Enterococcus phage 182 ORFs greater than 33 amino acids and sequences present in 
public sequence databases. 

Table 25 shows the predicted amino acid sequence as well as the predicted 
secondary structures map for two Enterococcus bacteriophage 182 ORFs. 

Table 26 shows the homology alignments between predicted Enterococcus 
bacteriophage 182 ORFs and the corresponding protein sequences present in public 
sequence databases. 

Table 27 lists Enterococcus sequences listed in GenBank providing possible 
Enterococcal target sequences for inhibitory Enterococcus bacteriophage 182 ORFs 
and other compounds with antibacterial activity. 

Table 28 shows the complete nucleotide sequence of the genome of 
Streptococcus bacteriophage Dp-1. 

Table 29 lists and shows the sequence position of the 273 ORFs identified in 
Pneumococcal bacteriophage Dp-1 that are greater than 33 amino acids, 85 of which 
are predicted to be expressed in Dp-1 as having a ribosomal binding site. That set of 
85 ORFs is shown in the attached drawings. 

Table 30 shows the nucleotide and predicted amino acid sequence of all 273 
ORFs identified in bacteriophage Dp-1 that are identified as being expressed. 

Table 31 shows the similarities identified in sequence between Streptococcus 
phage Dp-1 ORFs greater than 33 amino acids and sequences present in public 
sequence databases. 

Table 32 shows the 4731 bp sequence of Dp-1 published by Sheehan et al. 
(1997). 

Table 33 lists Streptococcus pneumoniae sequences listed in GenBank 
providing possible target sequences for inhibitory Streptococcus pneumoniae 
bacteriophage Dp-1 ORFs and other compounds with antibacterial activity. 
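
Several of the tables above list ORFs greater than 33 amino acids, and Table 29 
further distinguishes ORFs having a ribosomal binding site. The following is a 
minimal sketch of how such a listing could be generated, under stated assumptions: 
only the forward strand is scanned, ATG is the only start codon considered, and the 
ribosomal binding site check simply looks for the hypothetical motif AGGAGG within 
20 bases upstream of the start codon. The actual ORF prediction and binding site 
criteria used for the tables are not specified in this passage. 

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_AMINO_ACIDS = 33  # ORFs must encode more than this many amino acids


def find_orfs(genome, min_aa=MIN_AMINO_ACIDS):
    """Return (start, end) index pairs of forward-strand ORFs.

    An ORF runs from the first ATG in a reading frame to the next in-frame
    stop codon; end is exclusive and includes the stop codon. Only ORFs
    encoding more than min_aa amino acids are kept.
    """
    seq = genome.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_amino_acids = (i - start) // 3  # stop codon excluded
                if n_amino_acids > min_aa:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


def has_rbs(genome, orf_start, window=20, motif="AGGAGG"):
    """Assumed check: is the motif present within `window` bases upstream?"""
    upstream = genome.upper()[max(0, orf_start - window):orf_start]
    return motif in upstream


# Hypothetical test sequence: a binding-site motif, a spacer, then a 41-codon ORF.
example_genome = "CC" + "AGGAGG" + "TTTT" + "ATG" + "GCT" * 40 + "TAA"
example_orfs = find_orfs(example_genome)
```

A fuller treatment would also scan the reverse complement and allow alternative 
start codons, both of which are omitted here for brevity. 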


Background: 

As indicated above, the present invention is concerned, in part, with the use of 
bacteriophage coding sequences and the encoded polypeptides or RNA transcripts to 
identify bacterial targets for potential new antibacterial agents. Thus, the invention 
concerns the selection of relevant bacteria. Particularly relevant bacteria are those 
which are pathogens of a complex organism such as an animal, e.g., mammals, 
reptiles, and birds, and plants. Examples include Staphylococcus aureus, Enterococcus 
species, and Streptococcus pneumoniae. However, the invention can be applied to 
any bacterium (whether pathogenic or not) for which bacteriophage are available or 
which is found to have cellular components closely homologous to components 
targeted by phage of another bacterium. 

Thus, the invention also concerns the bacteriophage which can infect a 
selected bacterium. Identification of ORFs or products from the phage which inhibit 
the host bacterium both provides an inhibitor compound and allows identification of 
the bacterial target affected by the phage-encoded inhibitor. Such targets are thus 
identified as potential targets for development of other antibacterial agents or 
inhibitors and the use of those targets to inhibit those bacteria. As indicated above, 
even if such a target is not initially identified in a particular bacterium, such a target 
can still be identified if a homologous target is identified in another bacterium. 
Usually, but not necessarily, such another bacterium would be a genetically closely 
related bacterium. Indeed, in some cases, a phage-encoded inhibitor can also inhibit 
such a homologous bacterial cellular component. 

The demonstration that bacteriophage have adapted to inhibiting a host 
bacterium by acting on a particular cellular component or target provides a strong 
indication that that component is an appropriate target for developing and using 
antibacterial agents, e.g., in therapeutic treatments. Thus, the present invention 
provides additional guidance over mere identification of bacterial essential genes, as 
the present invention also provides an indication of accessibility of the target to an 
inhibitor, and an indication that the target is sufficiently stable over time (e.g., not 
subject to high rates of mutation) as phage acting on that target were able to develop 
and persist. Thus, the present invention identifies a subset of essential cellular 
components which are particularly likely to be appropriate targets for development of 
antibacterial agents. 

The invention also, therefore, concerns the development or identification of 
inhibitors of bacteria, in addition to the phage-encoded inhibitory proteins (or RNA 
transcripts), which are active on the targets of bacteriophage-encoded inhibitors. As 
described herein, such inhibitors can be of a variety of different types, but are 
preferably small molecules. 

The following description provides preferred methods for use in the various 
aspects of the invention. However, as those skilled in the art will readily recognize, 
other approaches can be used to obtain and process relevant information. Thus the 
invention is not limited to the specifically described methods. In addition, the 
following description provides a set of steps in a particular order. That series of steps 
describes the overall development involved in the present invention. However, it is 
clear that individual steps or portions of steps may be usefully practiced separately, 
and, further, that certain steps may be performed in a different order or even bypassed 
if appropriate information is already available or is provided by other sources or 
methods. 

Selecting and Growing Phage, and Isolating DNA 

Conceptually, the first step involves selecting bacterial hosts of interest. 
Preferably, but not necessarily, such hosts will be pathogens of clinical importance.
Alternatively, because bacteria all share certain fundamental metabolic and structural
features, these features can be targeted for study in one strain, for example a
nonpathogenic one, and the results extrapolated to pathogenic ones.
Nonpathogenic strains may also offer initial advantages in being not only less
dangerous, but also, for example, in having better growth and culturing characteristics
and/or better developed molecular biology techniques and reagents. Consequently,
the invention advantageously provides the ability to target virtually any bacteria, but
preferably pathogenic bacteria, with antimicrobial compounds designed and/or
developed using bacteriophage inhibitory proteins and peptides from phage with
non-pathogenic and/or pathogenic hosts.

We have selected Staphylococcus aureus, Streptococcus pneumoniae, various
Enterococci, and Pseudomonas aeruginosa as initial exemplary pathogens. These
bacteria are a major cause of morbidity and mortality in hospital-based infections, and
the appearance of antibiotic resistance in these organisms makes it increasingly
difficult to treat benign infections involving them. Such infections can
include, for example, otitis media, sinusitis, and skin and airway infections (Neu,
H.C. (1992). Science 257, 1064-1073). However, the approach described below is
clearly applicable to any human bacterial pathogen, including but not restricted to
Mycobacterium tuberculosis, Neisseria gonorrhoeae, Haemophilus influenzae,
Acinetobacter, Escherichia coli, Shigella dysenteriae, Streptococcus pyogenes,
Helicobacter pylori, and Mycoplasma species. This invention can also be applied to
the discovery of anti-bacterial compounds directed against pathogens of animals other
than humans, for example, sheep, cattle, swine, dogs, cats, birds, and reptiles.
Similarly, the invention is not limited to animals, but also applies to plants and plant
pathogens.

In general, the bacteria are grown according to standard methodologies
employed in the art, including solid, semi-solid or liquid culturing, which procedures
can be found in or extrapolated from standard sources such as Maloy, S.R., Stewart,
V.J., and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold Spring
Harbor Laboratory Press; Maniatis, T. et al. (1989) Molecular Cloning: A
Laboratory Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; or
Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology. John Wiley &
Sons, Secaucus, N.J. Culture conditions are selected which are adapted to the
particular bacterium, generally using culture conditions known in the art as
appropriate, or adaptations of those conditions.

Nucleic acids within these bacteria can be routinely extracted through common 
procedures such as described in the above-referenced manuals and as generally known 
to those skilled in the art. Those nucleic acid stocks can then be used to practice the 
other inventive aspects described below. 

Selection and Growth of Bacteriophage, and Isolation of DNA 

The second step involves assembling a group of bacteriophages (phage 
collection) for one or more of the targeted bacterial hosts. While the invention can be 
utilized with a single bacteriophage for a pathogen or other bacterium, it is preferable 
to utilize a plurality of phage for each bacterium, as comparisons between a plurality
of such phage provide useful additional information. Non-limiting examples of
phage and sources for some of the above-mentioned pathogenic bacteria are found in
Table 1. The criteria used to select such phages are that they are infectious for the
microbe targeted, and replicate in, lyse, or otherwise inhibit growth of the bacterium 
in a measurable fashion. These phages can be very different from one another 
(representing different families), as judged by criteria such as morphology (head, tail, 
plate, etc.), and similarity of genome nucleotide sequence (cross-hybridization). Since 
such diverse bacteriophages are expected to block bacterial host metabolism and
ultimately inhibit the host by a variety of mechanisms, their combined study will lead to the
identification of different mechanisms by which the phages independently inhibit
bacterial targets. Examples include degradation of host DNA (Parson K. A., and
Snustad, D.P. (1975). J. Virol. 15, 221-444) and inhibition of host RNA transcription
(Severinova, E., Severinov, K. and Darst, S.A. (1998). J. Mol. Biol. 279, 9-18). This,
in turn, yields novel information on phage proteins that can inhibit the targeted
microbe. As explained below, this 1) forms the basis of novel drug discovery efforts
based on knowledge of the primary amino acid sequence of the phage inhibitor
protein (e.g., peptide fragments or peptidomimetics) and/or 2) leads to the
identification of bacterial biochemical pathways, the proteins of which are essential or
significant for survival of the targeted microbe, and which enzymatic steps or
chemical reactions can be targeted by classical drug discovery methods using
molecular inhibitors, for example, small molecule inhibitors.

Bacteriophage are generally either of two types, lytic or filamentous, meaning 
they either outright destroy their host and seek out new hosts after replication, or else
continuously propagate and extrude progeny phage from the same host without
destroying it. Regardless of the phage life cycle and type, preferred embodiments 
incorporate phage which impede cell growth in measurable fashion and preferably 
stop cell growth. To this end, lytic phage are preferred, although certain nonlytic 
species may also suffice, e.g., if sufficiently bacteriostatic. 

Various procedures that are commonly understood by those of skill in the art
can be routinely employed to grow, isolate, and purify phage. Such procedures are
exemplified by those found in common laboratory aids such as Maloy, S.R.,
Stewart, V.J., and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold
Spring Harbor Laboratory Press; Maniatis, T. et al. (1989) Molecular Cloning: A
Laboratory Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; and
Ausubel, F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John
Wiley & Sons, Secaucus, N.J. The techniques generally involve the culturing of
infected bacterial cells that are lysed naturally and/or with chemical assistance, for
example, by the use of an organic solvent such as chloroform that destroys the host
cells, thereby liberating the phage within. Following this, the cellular debris is
centrifuged away from the supernatant containing the phage particles, and the phage
are then selectively precipitated out of the supernatant using various
methods, usually employing alcohols and/or other chemical compounds
such as polyethylene glycol (PEG). The resulting phage can be further purified using
various density gradient/centrifugation methodologies. The purified phage are then
chemically lysed, thereby releasing their nucleic acids, which can be conveniently
precipitated out of the supernatant to yield a viral nucleic acid supply of the phage of
interest.

Exemplary bacteriophage are indicated in Table 1, along with sources where
those phage may be obtained.

Exemplary bacteria include the reference bacteria for the identified 
bacteriophage, available from the same sources. 

Characterizing Bacteriophage Genomes for ORFs

The third step involves systematically characterizing the genetic information
contained in the phage genome. Within this genetic information is the sequence of all
RNAs and proteins encoded by the phage, including those that are essential or
instrumental in inhibiting their host. This characterization is preferably done in a
systematic fashion. For example, this can be done by first isolating high molecular
weight genomic DNA from the phage using standard bacterial lysis methods, followed
by phage purification using density gradient ultracentrifugation, and extraction of
nucleic acid from the purified phage preparation. The high molecular weight DNA is
then analyzed to determine its size and to evaluate a proper strategy for its sequencing.
The DNA is broken down into smaller size fragments by sonication or partial
digestion with frequently cutting restriction enzymes such as Sau3A to yield
predominantly 1 to 2 kilobase length DNA, which DNA can then be resolved by gel
electrophoresis followed by extraction from the gel.

The ends of the fragments are enzymatically treated to render them suitable for 
cloning and the pools of fragments are cloned in a bacterial plasmid to generate a 
library of the phage genome. Several hundred of these random DNA fragments 
contained in the plasmid vector are isolated as clones after introduction into an
appropriate bacterium, usually Escherichia coli. They are then individually expanded
in culture and the DNA from each individual clone is purified. The nucleotide
sequences of the inserts of these clones are determined by standard automated or
manual methods, using oligonucleotide primers located on either side of the cloning
site to direct polymerase mediated sequencing (e.g., the Sanger sequencing method or
a modification of that method). Other sequencing methods can also be used.

The sequence of individual clones is then deposited in a computer, and 
specific software programs (for example, Sequencher™, Gene Codes Corp.) are used 
to look for overlap between the various sequences, resulting in ordering of contig 
sequences and ultimately providing the complete sequence of the entire bacteriophage
genome (one such example is given in Table 2 for Staphylococcus aureus
bacteriophage 77; others are also provided herein). This complete nucleotide
sequence is preferably determined with a redundancy of at least 3- to 5-fold (number 
of independent sequencing events covering the same region) in order to minimize 
sequencing errors. 
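
By way of non-limiting illustration, the redundancy criterion above can be checked with a short script. The following Python sketch counts how many independent reads cover each genome position and lists positions that fall below a chosen minimum. The function names, the 1-based inclusive read coordinates, and the toy data are assumptions introduced for this example; they are not taken from the assembly software named above.

```python
def coverage_depth(genome_length, reads):
    """Per-position read depth; reads are (start, end), 1-based inclusive."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    delta = [0] * (genome_length + 2)
    for start, end in reads:
        delta[start] += 1
        delta[end + 1] -= 1
    depth = []
    running = 0
    for position in range(1, genome_length + 1):
        running += delta[position]
        depth.append(running)
    return depth


def under_covered(depth, minimum=3):
    """Return 1-based positions sequenced fewer than `minimum` times."""
    return [index + 1 for index, value in enumerate(depth) if value < minimum]


# Hypothetical toy genome of 10 bases covered by four reads.
toy_reads = [(1, 6), (1, 8), (3, 10), (2, 7)]
toy_depth = coverage_depth(10, toy_reads)
print(under_covered(toy_depth, minimum=3))  # prints [1, 8, 9, 10]
```

Positions reported by such a check would be candidates for additional sequencing before the genome sequence is treated as complete.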

Preferably, the bacterial strain used as a phage host should not possess any
other innate plasmids, transposons, or other phage or incompatible sequences that
would complicate or otherwise make the various manipulations and analyses more
difficult.

Commercially available computer software programs are used to translate the
nucleotide sequence of the phage to identify all protein sequences encoded by the
phage (hereafter called open reading frames or ORFs). (Customized software can
clearly also be used.) As phages are known to transcribe their genome into RNA from
both strands, in both directions, and sometimes in more than one frame for the same
sequence, this exercise is done for both strands and in all six possible reading frames.
As evolutionary constraints have forced the phage to conserve all of its vital protein
sequences in as small a genome as possible, it is straightforward to identify all the
proteins encoded by the phage by simple examination of the 6 translation frames of
the genome. Once these ORFs are identified, they are cataloged into a phage
proteome database (Table 3 lists ORFs identified from phage 77; ORF lists are also
provided for other exemplary phage). This analysis is preferably performed for each
phage under study. The process of ORF identification can be varied depending on the
desired results. For example, the minimum length for the putative encoded
polypeptide can be varied, and/or putative coding regions that have an associated
Shine-Dalgarno sequence can be selected. In the case of phage 77 ORFs, such 
parameter adjustment was performed and resulted in the identification of ORFs as 
listed herein. Different parameters had resulted in the identification of the ORFs
listed in the preceding U.S. Provisional Application 60/110,992, filed December 3,
1998, which is hereby incorporated by reference in its entirety. 
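
By way of non-limiting illustration, the six-frame ORF scan and the adjustable parameters discussed above (minimum encoded length, presence of a Shine-Dalgarno sequence) can be sketched in Python as follows. This is a minimal sketch and not the software actually used for phage 77. The set of start codons, the "GGAGG" core, the 20-nucleotide upstream window, and all function names are assumptions made for the example. Coordinates are 0-based and half-open, unlike the 1-based genomic positions used in the tables herein.

```python
STARTS = {"ATG", "GTG", "TTG"}   # assumption: commonly used bacterial start codons
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Yield (start, end) of ORFs in the three frames of one strand.

    An ORF runs from the first start codon after the previous stop up to
    and including the next in-frame stop codon.
    """
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in STARTS:
                start = i
            elif codon in STOPS:
                if start is not None and (i - start) // 3 >= min_codons:
                    yield start, i + 3
                start = None


def find_orfs(genome, min_codons=33):
    """Scan both strands; return (strand, start, end) in forward coordinates."""
    genome = genome.upper()
    length = len(genome)
    found = [("+", s, e) for s, e in orfs_in_strand(genome, min_codons)]
    for s, e in orfs_in_strand(reverse_complement(genome), min_codons):
        found.append(("-", length - e, length - s))
    return found


def has_shine_dalgarno(strand_seq, orf_start, window=20, core="GGAGG"):
    """Hypothetical filter: is the core motif present just upstream of the start?"""
    upstream = strand_seq[max(0, orf_start - window):orf_start]
    return core in upstream
```

Raising `min_codons` or requiring `has_shine_dalgarno` to return true narrows the ORF list, which is one way the differing ORF sets obtained with different parameters could arise.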

Exemplary phage 77 ORFs identified in that provisional application and as 
identified herein are shown in the following table: 



ORF ID from    Genomic        a.a.    Start     ORF ID from    Genomic        a.a.    Start
60/110,992     position       size    codon     241/190        position       size    codon

77ORF016       2369-24024     251     TTG       77ORF017       23269-23982    237     ATG
77ORF019       39845-40501    218     ATA       77ORF019       39851-40501    216     ATG
77ORF050       29268-29564    98      ATG       77ORF182       29268-29564    98      ATG
77ORF050       29268-29564    98      ATG       77ORF043       29304-29564    86      ATG
77ORF067       34312-34551    79      CTG       77ORF104       34393-34551    52      ATG
77ORF146       29051-29212    53      ATG       77ORF102       29051-29212    53      ATG

Identifying and Characterizing Inhibitory Phage ORFs 

The fourth step entails identifying the phage protein or proteins or RNA 
transcripts that have the ability to inhibit their bacterial hosts. This can be
accomplished, for example, by either or both of two non-mutually exclusive methods.
The first method makes use of bioinformatics. Over the past few years, a large amount
of nucleotide sequence information and corresponding translated products has
become available through large genome sequencing projects for a variety of
organisms including mammals, insects, plants, unicellular eukaryotes (yeast and
fungi), as well as several bacterial genomes such as E. coli, Mycobacterium
tuberculosis, Bacillus subtilis, Staphylococcus aureus and many others. Such
sequences have been deposited in public databases (for example, the non-redundant
sequence database at GenBank and the SwissProt protein sequence database;
http://www.ncbi.nlm.nih.gov) and can be freely accessed to compare any specific
query sequence to those present in such databases. For example, GenBank contains
over 1.6 billion nucleotides corresponding to 2.3 million sequence records. Several
computer programs and servers (e.g., TBLASTN) have been created to allow the rapid
identification of homology between any given sequence from one organism to that of
another present in such databases, and such programs are public and available free of
charge. 
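
By way of non-limiting illustration, the idea behind a translated search such as TBLASTN (comparing a protein query against a nucleotide sequence translated in all six frames) can be sketched in Python as follows. The sketch uses the standard genetic code and reports only exact occurrences of a peptide; this exact-match test is a deliberate simplification standing in for the scored, gapped alignments that a program such as TBLASTN performs. All function names and the toy sequence are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translations(dna):
    """Return a dict mapping frame names (+1..+3, -1..-3) to translations."""
    dna = dna.upper()
    rc = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def frames_containing(peptide, dna):
    """List the frames whose translation contains the peptide exactly."""
    return sorted(name for name, protein in six_frame_translations(dna).items()
                  if peptide in protein)


# Hypothetical 12-nucleotide sequence encoding Met-Ala-Lys-stop in frame +1.
print(frames_containing("MAK", "ATGGCTAAATGA"))  # prints ['+1']
```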

In addition, it has been well established that basic biochemical pathways can
be conserved in very distant organisms (for example bacteria and man), and that the
proteins performing the various enzymatic steps in these pathways are themselves
conserved at the amino acid sequence level. Thus, proteins performing similar
functions (e.g. DNA repair, RNA transcription, RNA translation) have frequently
preserved key structural signatures, identifiable by similarities across regions of
proteins (domains and motifs). The antimicrobials of the present invention will
preferably target features and targets that are highly characteristic or conserved in 
microbes, and not higher organisms. 

Most genomes encode individual proteins or groups of proteins that can be 
assembled into protein families that have been evolutionarily conserved. Therefore,
similarity between a new query sequence and that of a member of a protein family
(reference sequences from public databases) can immediately suggest a biochemical 
function for the novel query sequence, which in our case is a phage ORF. 

The sequence homology between evolutionarily distant
members of a protein family is usually not randomly distributed along the entire
length of the sequence but is often clustered into "motifs" and "domains". These
correspond to key three-dimensional folds that form the catalytic and/or regulatory
structures that perform key biochemical function(s) for the group of proteins.
Commercially available computer software programs can identify such motifs in a
new query sequence, again providing functional information for the query sequence.
Such structural and functional motifs have also been derived from the combined
analysis of primary sequence databases (protein sequences) and protein structure
databases (X-ray crystallography, nuclear magnetic resonance) using so-called
"threading" methods (Rost, B. and Sander, C. (1996) Annu. Rev. Biophys. Biomol.
Struct. 25,113-136). 
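
By way of non-limiting illustration, motif identification in a query sequence can be sketched in Python by converting a PROSITE-style pattern into a regular expression. The example pattern is the widely used ATP/GTP-binding "P-loop" signature, [AG]-x(4)-G-K-[ST]. The converter below handles only simple pattern elements, and the function names and the toy protein sequence are assumptions made for the example.

```python
import re

# One pattern element: a residue, 'x', an allowed set [..] or an excluded
# set {..}, optionally followed by a repeat count (n) or range (n,m).
ELEMENT = re.compile(r"([A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Convert a simple PROSITE-style pattern into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        if residue == "x":
            residue = "."                       # any residue
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"  # excluded residues
        if low and high:
            residue += "{" + low + "," + high + "}"
        elif low:
            residue += "{" + low + "}"
        parts.append(residue)
    return "".join(parts)


def find_motif(protein, pattern):
    """Return (1-based position, matched text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(protein)]


# Hypothetical protein fragment containing a P-loop.
print(find_motif("MSTGAPGSGKSTLLA", "[AG]-x(4)-G-K-[ST]"))  # prints [(4, 'GAPGSGKS')]
```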

Such motifs and folds are themselves deposited in public databases which can
be directly accessed (for example, SwissProt database; 3D-ALI at EMBL, Heidelberg;
PROSITE). This basic exercise leads to a structural homology map in which each of
the phage ORFs has been probed for such similarities, and where initial structural and
functional hits are identified (selected examples of sequence homologies detected
between individual ORFs from the genome of Staphylococcus aureus bacteriophage
77 and sequences deposited in public databases are shown in Table 5 for ORFs
17/19/43/102/104/182).

This analysis can point out phage proteins with similarity to proteins from 
other phages (such as those for E. coli) playing an important role in the basic 
biochemical pathways of the phage (such as DNA replication, RNA transcription, 
tRNAs, coat protein and assembly). Selected examples of such proteins include
integrase and capsid protein. Therefore, this analysis enables identification and
elimination of non-essential ORFs as candidates for an inhibitor function, as well as
the identification of (potentially) useful ones. 

In addition, this analysis can point out specific ORFs as possible inhibitor 
ORFs. For example, these ORFs may encode proteins or enzymes that alter bacterial
cell structure, metabolism or physiology, and ultimately viability. Examples of such
proteins present in the genome of Staphylococcus aureus bacteriophage 77 include
orf14 (deoxyuridine triphosphatase from bacteriophage T5), and orf15 (sialidase).
(These ORF identifications are as listed in provisional application 60/110,992.) Other
examples include ORFs 9 and 12 of S. aureus phage 44 AHJD, which encode the
putative lysis functions found in many bacteriophages - a "holin" and an "amidase".

In addition, it is well known that bacterial and eukaryotic viruses can usurp 
pathways from their host in order to use them to their advantage in blocking host 
cellular pathways upon infection. The phage can achieve this by 1) directly producing 
an inhibitor of a key host pathway (e.g. T7 gene 0.5 and 2), 2) directly producing a
novel activity (e.g. T4 DNA polymerase), and 3) altering concentrations of cell
components by producing similar functions (e.g. T4 transfer RNAs). The
identification of sequence similarity between phage ORFs and bacterial host genome
sequences will be highly indicative of such a mechanism. (Selected examples of such
homologies are listed in Figure 4 of the provisional application 60/110,992 and
include orf4 (homologous to autolysin), orf20 (hypothetical protein from
Staphylococcus aureus) and orf29 (hypothetical protein from Staphylococcus aureus).)
These ORFs can be analyzed by a standard biochemical approach to directly test their
inhibitor functions (e.g., as described below).

Alternatively, a homology search may reveal that a given phage ORF is related
to a protein present in the databases having an activity known to be inhibitory (e.g.,
inhibitor of host RNA polymerase by E. coli bacteriophage T7). Such a finding would
implicate the phage ORF product in a related activity. This will also suggest that a
new antimicrobial could be derived by a mimetic approach (e.g., peptidomimetic)
imitating this function or by a small molecule inhibitor to the bacterial target of the
phage ORF, or any steps in the relevant host metabolic pathway, e.g., by high throughput
screening of small molecule libraries. Selected examples of such similarity between
ORFs of Staphylococcus aureus bacteriophage 77 and proteins with inhibitor functions
for bacterial hosts are listed in Figure 4 of the provisional application 60/110,992.
These include orf9 (similar to bacteriophage P1 kilA function), and orf4 (autolysin of
Staphylococcus aureus, amidase enzymatic activity).

The second method is the biochemical study of individual ORFs for inhibitor
function. The rationale is that their expression or overexpression will block cellular
pathways of the host, ultimately leading to arrest and/or inhibition of host metabolism.
In addition, such ORFs can alter host metabolism in different ways, including
modification of pathogenicity. Therefore, individual ORFs identified above are expressed, preferably
overexpressed, in the host and the effect of this expression or overexpression on host
metabolism and viability is measured. This approach can be systematically applied to
every ORF of the phage, if necessary, and does not rely on the absolute identification
of candidate ORFs by bioinformatics. Individual ORFs are resynthesized from the
phage genomic DNA, e.g., by the polymerase chain reaction (PCR), preferably using
oligonucleotide primers flanking the ORF on either side. These single ORFs are
preferably engineered so that they contain appropriate cloning sites at their extremities
to allow their introduction into a new bacterial expression plasmid, allowing
propagation in a standard bacterial host such as E. coli, but containing the necessary
information for plasmid replication in the target microbe such as S. aureus (hereafter
referred to as shuttle vector). Shuttle vectors and their use are well known in the art.

Such shuttle vectors preferably also contain regulatory sequences that allow
inducible expression of the introduced ORF. As the candidate ORF may encode an
inhibitor function that will eliminate the host, it is beneficial that it not be expressed
prior to testing for activity. Thus, screening for such sequences when expressed in a
constitutive fashion is less likely to be successful when the inhibitor is lethal. In the
exemplary inducible system presented in Figures 1A, 1B, 2, and 7, regulatory
sequences from the ars operon of S. aureus are used to direct individual ORF
expression in S. aureus (or other bacteria in which the ars system is functional). The
ars operon encodes a series of proteins which normally mediate the extrusion of
arsenite and other trivalent oxyanions from the cells when they are exposed to such
toxic substances in their environment. The operon encoding this detoxifying
mechanism is normally silent and only induced when arsenite-related compounds are
present. (Tauriainen, S. et al. (1997) Appl. Environ. Microbiol. Vol. 63, No. 11, p. 4456-
4461.)

Therefore, individual phage ORFs can be expressed in S. aureus in an 
inducible fashion by adding to the culture medium non-toxic arsenite concentrations
during the growth of individual S. aureus clones expressing such individual phage
ORFs. Toxicity of the phage inhibitor ORF for the host is monitored by reduction or
arrest of growth under induction conditions, as measured by optical density in liquid
culture or after plating the induced cultures on solid medium. Subsequently,
interference of the phage ORF with the host biochemical pathways ultimately leading
to reduced or arrested host metabolism can be measured by pulse-chase experiments
using radiolabeled precursors of either DNA replication, RNA transcription, or protein 
synthesis. Similar constructs can be made and used for other bacteria using well- 
known techniques. 
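
By way of non-limiting illustration, the optical density readout described above can be reduced to a percent growth inhibition for each clone. The following Python sketch compares an induced culture with its uninduced control. The 50% threshold, the clone names, and the optical density values are hypothetical and serve only to show the calculation.

```python
def percent_growth_inhibition(od_uninduced, od_induced, od_blank=0.0):
    """Percent reduction in blank-corrected optical density upon induction."""
    control = od_uninduced - od_blank
    if control <= 0:
        raise ValueError("uninduced culture shows no growth above blank")
    treated = od_induced - od_blank
    return 100.0 * (1.0 - treated / control)


def flag_inhibitory(clones, threshold=50.0):
    """Return names of clones whose inhibition meets a chosen threshold.

    `clones` maps a clone name to (OD uninduced, OD induced).
    """
    return [name for name, (uninduced, induced) in clones.items()
            if percent_growth_inhibition(uninduced, induced) >= threshold]


# Hypothetical readings for two clones carrying different phage ORFs.
readings = {"orfA": (0.8, 0.2), "orfB": (0.8, 0.76)}
print(flag_inhibitory(readings))  # prints ['orfA']
```

A clone flagged in this way would then be a candidate for the pulse-chase experiments described above.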

Those skilled in the art are familiar with a variety of other inducible systems
which can also be used for the controlled expression of phage ORFs, including, for
example, lactose (see e.g., Stratagene's LacSwitch™II system; La Jolla, CA) and
tetracycline-based systems (see, e.g. Clontech's Tet On/Tet Off™ system; Palo Alto,
CA). The arsenite-inducible system described is further depicted in Figures 1, 2 and 7.
The selection or construction of shuttle vectors and the selection and use of
inducible systems are well known, and thus shuttle vectors appropriate for other
bacterial species can be readily provided by those skilled in the art.

Standard methodologies for expressing proteins from constructs, and isolating 
and manipulating those proteins, for example in cross-linking and affinity
chromatography studies, may be found in various commonly available and known
laboratory manuals. See, e.g., Current Protocols in Protein Science, John Wiley &
Sons, Secaucus, N.J., and Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory
Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.

It has been found that certain phage or other viruses inhibit host cells, at least
in part, by producing an antisense RNA which binds to and inhibits translation from a
bacterial RNA sequence. Thus, in the case of potentially inhibitory RNA transcripts
encoded by the phage genome, a strong indicator of a possible inhibitory function is
provided by the identification of phage sequence which is identical to or fully
complementary (or with only a small percentage of mismatch, e.g., <10%, preferably
less than 5%, most preferably less than 3%) to a bacterial sequence. This approach is
convenient in the case of bacteria that have been essentially completely sequenced, as
the comparison can be performed by computer using public database information.

The inhibitory effect of the transcript can be confirmed using expression of the
phage sequence in a host bacterium. If needed, such inhibition can also be tested by
transfecting the cells with a vector that will transcribe the phage sequence to form
RNA in such manner that the RNA produced will not be translated into a polypeptide.
Inhibition under such conditions provides a strong indication that the inhibition is due
to the transcript rather than to an encoded polypeptide. 
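
By way of non-limiting illustration, the computer comparison for candidate antisense transcripts (complementarity to a bacterial sequence with less than 10% mismatch) can be sketched in Python as follows. The sketch slides the reverse complement of a phage segment along a bacterial sequence and reports ungapped windows below the mismatch threshold. The function names and toy sequences are assumptions made for the example, and a practical search would use an alignment program that also tolerates gaps.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatch_fraction(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def antisense_hits(phage_segment, bacterial_seq, max_mismatch=0.10):
    """Find bacterial windows to which the phage segment is complementary.

    Returns (0-based position, mismatch fraction) for every ungapped window
    whose mismatch fraction is strictly below `max_mismatch`.
    """
    probe = reverse_complement(phage_segment)
    target = bacterial_seq.upper()
    hits = []
    for i in range(len(target) - len(probe) + 1):
        fraction = mismatch_fraction(probe, target[i:i + len(probe)])
        if fraction < max_mismatch:
            hits.append((i, fraction))
    return hits


# Hypothetical 20-nucleotide bacterial region and a perfectly antisense phage segment.
bacterial_region = "ATGGCTAGCAAGGAGATATC"
phage_segment = reverse_complement(bacterial_region)
hits = antisense_hits(phage_segment, "CCCCC" + bacterial_region + "TTTTT")
```

Lowering `max_mismatch` to 0.05 or 0.03 applies the more stringent preferred thresholds stated above.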

In an alternative, the expression of an ORF in a host bacterium is found to be 
inhibitory, but the inhibition is found to be due to an RNA product of the genomic 
coding region. For antisense inhibition, the bacterial target nucleic
acid sequence can be identified by inspection of the phage sequence, and the full
sequence of the relevant coding region for the bacterial product can be found from a 
database of the bacterial genomic sequence or can be isolated by standard techniques 
(e.g., a clone in a genomic library can be isolated which contains the full bacterial 
ORF, and then sequenced). 

In either case, the identification of a target which is inhibited by an RNA
transcript produced by a phage provides both the possible inhibition of bacteria
naturally containing the same target nucleic acid sequence and the ability to use
the target sequence in screening for other types of compounds which will act directly
on the target nucleic acid sequence or on a polypeptide product expressed or
regulated, at least in part, by the target of the inhibitory phage RNA.

In some cases it will be found that the target of an inhibitory phage RNA or
protein has previously been found to be a target for an antibacterial agent. In such
cases, the phage inhibitor can still provide useful information if it is found that the
phage-encoded product acts at a different site than the previously identified
antibacterial agent or inhibitor, i.e., acts at a phage-specific site. For many targets,
action at a different site provides highly beneficial characteristics and/or information.
For example, an alternate site of inhibitor action can at least partially overcome a
resistance mechanism in a bacterium. As an illustration, in many cases, resistance is
due, in large part, to altered binding characteristics of the immediate target to the
antibacterial agent. The altered binding is due to a structural change which prevents
or destabilizes the binding. However, the structural change is frequently quite local,
so that compounds which bind at different local sites will be unaffected or affected to a
much lesser degree. Indeed, in some cases the local sites will be on a different
molecule and so may be completely unaffected by the local structural change creating
resistance to the original agent(s). An example of resistance due to altered binding is
provided by methicillin-resistant Staphylococcus aureus, in which the resistance is
due to an altered penicillin-binding protein.

In other cases, a new site of action can have improved accessibility as 
compared to a site acted on by a previously identified agent. This can, for example,
assist in allowing effective treatment at lower doses, or in allowing access by a larger
range of types of compounds, potentially allowing identification of more potential 
active agents. 

Another advantage is that the structural characteristics of a different site of 
action will lead to identification and/or development of inhibitors with different
structures and different pharmacological parameters. This can allow a greater range of
possibilities when selecting an antibacterial agent. 

Yet further, different sites often produce different inhibitory characteristics in 
the target organism. This is commonly the case for multi-domain target proteins. 
Thus, inhibition targeting an alternate site can produce more efficacious action, e.g.,
faster killing, slower development of resistance, lower numbers of surviving cells, and
different secondary effects (for example, different nutrient utilization). 

Staphylococcus aureus phage 77 

As indicated above, the present invention is concerned, in part, with the use of
bacteriophage 77 coding sequences and the encoded polypeptides or RNA transcripts
to identify bacterial targets for potential new antibacterial agents. 

As described, phage 77 ORFs 17, 19, 43, 102, 104, and 182 have been found 
to have bacteria-inhibiting function. Identification of ORFs 17, 19, 43, 102, 104, and
182 and products from the phage which inhibit the host bacterium both provides an
inhibitor compound and allows identification of the bacterial target affected by the
phage-encoded inhibitor. Such a target is thus identified as a potential target for
development of other antibacterial agents or inhibitors and the use of those targets to
inhibit those bacteria. As indicated above, even if such a target is not initially
identified in a particular bacterium, such a target can still be identified if a
homologous target is identified in another bacterium. Usually, but not necessarily,
such another bacterium would be a genetically closely related bacterium. Indeed, in 
some cases, an inhibitor encoded by phage 77 ORF 17, 19, 43, 102, 104, or 182 can 
also inhibit such a homologous bacterial cellular component. 

Possible bacterial target sequences are described herein by reference to sequence
source sites. In preferred embodiments, the sequence encoding the target corresponds
to a S. aureus nucleic acid sequence available from numerous sources including S.
aureus sequences deposited in GenBank, S. aureus sequences found in European
Patent Application No. 97100110.7 to Human Genome Sciences, Inc. filed January 7,
1997, S. aureus sequences available from TIGR at
http://www.tigr.org/tdb/mdb/mdb.html, and S. aureus sequences available from the
Oklahoma University S. aureus sequencing project at the following URL:
http://www.genome.ou.edu/staph_new.html. Such possible targets are particularly
applicable to S. aureus phages 77, 3A, 96, and 44 AHJD.

The amino acid sequence of a polypeptide target is readily provided by 
translating the corresponding coding region. For the sake of brevity, the sequences 
are not reproduced herein. Also, in preferred embodiments, a target sequence 
corresponds to a S. aureus coding sequence corresponding to a sequence listed in 
Table 15 herein. The listing in Table 15 describes S. aureus sequences currently listed 
with GenBank. Again, for the sake of brevity, the sequences are described by 
reference to the database accession numbers instead of being written out in full herein. 
In cases where an entry for a coding region is not complete, the complete sequence 
can be readily obtained by routine methods, e.g., by isolating a clone in a phage host 
S. aureus genomic library, and sequencing the clone insert to provide the relevant 
coding region. The boundaries of the coding region can be identified by conventional 
sequence analysis and/or by expression in a bacterium in which the endogenous copy 
of the coding region has been inactivated and using subcloning to identify the 
functional start and stop codons for the coding region. 
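
By way of illustration only, the translation of a coding region into the amino 
acid sequence of a polypeptide target can be sketched as follows. This is a minimal 
sketch that assumes the standard genetic code and an in-frame, 5' to 3' coding 
strand; the function name translate is hypothetical and is not part of any method 
described herein. Note that alternative bacterial start codons (e.g., GTG or TTG) are 
read as methionine at initiation, which this simple table lookup does not model. 

```python
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code, listed in the conventional
# TCAG codon order (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_region: str) -> str:
    """Translate an in-frame DNA coding region up to the first stop codon.

    Assumption: the input contains only A, C, G, T (or U); any other
    character raises KeyError.
    """
    seq = coding_region.upper().replace("U", "T")
    protein = []
    # Ignore trailing bases that do not form a complete codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For example, translate("ATGGCTAAATAA") reads the codons ATG, GCT, AAA and stops 
at TAA. 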

Staphylococcus aureus phage 44AHJD 

The present invention also can utilize the identification of naturally occurring 
DNA sequence elements within Staphylococcus aureus bacteriophage 44AHJD which 
encode proteins with antimicrobial activity. 

Such identification can utilize bioinformatics identification of specific proteins 
(ORFs) utilized by Staphylococcus aureus bacteriophage 44AHJD during the viral life 
cycle which slow or arrest growth of the Staphylococcus aureus host or cause its 
death, including lysis of the infected bacteria. Thus, some of 
the bacteriophage 44AHJD DNA sequences encoding these proteins (ORFs) are 
predicted to encode antimicrobial functions. Information derived from these DNA 
sequences and translated ORFs can, in turn, be utilized to develop, by 
peptidomimetics, inhibitory compounds that can also function as antimicrobials. In 
addition, the host bacterial proteins that are targeted and inhibited by the 
antimicrobial bacteriophage ORFs can themselves provide novel targets for drug 
discovery. 

The methodology described above is used to identify and characterize DNA 
sequences from Staphylococcus sp. bacteriophage 44AHJD that have antimicrobial 
activity. As described in the Examples, the Staphylococcus aureus propagating strain 
(PS 44A), obtained from the Felix d'Herelle Reference Centre (#HER 1101), was 
used as a host to propagate its phage 44AHJD, also obtained from the Felix d'Herelle 
Reference Centre (#HER 101). By sequencing, we found that bacteriophage 44AHJD 
consists of 16,668 bp (Table 16) predicted to encode 73 ORFs greater than 33 amino 
acids (Tables 17 & 18). Computational analysis of the predicted protein products of 
Staphylococcus aureus bacteriophage 44AHJD identified homologs in public sequence 
databases as listed in Tables 19 and 20, along with the accompanying list of related 
proteins. 
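
The enumeration of ORFs above a length threshold can be illustrated by the 
following minimal sketch, which scans all six reading frames of a genome sequence 
and reports ORFs encoding more than 33 amino acids. The choice of start codons 
(ATG, GTG, TTG), the rule of taking the first start codon after a stop, and the 
function names are assumptions made for illustration; the ORF prediction actually 
used for the Tables may differ. 

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "GTG", "TTG"}  # assumed set of bacterial start codons


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(genome: str, min_codons: int = 34):
    """Return (strand, frame, start, end, n_codons) for each ORF found.

    Coordinates are 0-based, half-open, and relative to the scanned strand
    (for "-" they refer to the reverse complement). n_codons excludes the
    stop codon, so min_codons=34 means "greater than 33 amino acids".
    """
    genome = genome.upper()
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon in STARTS:
                    start = i  # first start codon since the last stop
                elif start is not None and codon in STOPS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, start, i + 3, n_codons))
                    start = None
    return orfs
```

Applied to a sequenced phage genome, len(find_orfs(genome)) gives the number of 
candidate ORFs under these assumptions; each candidate can then be translated and 
compared against public databases. 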

From this analysis, it is apparent that 3 genes (ORF 3, 7, and 8) are related to 
structural proteins found in other bacteriophages. These include genes predicted to 
encode a tail protein (ORF 3), an upper collar/connector protein of the phage virion 
(ORF 7), and a lower collar protein (ORF 8). Bioinformatics has also identified one 
gene whose product is likely involved in phage DNA synthesis. This gene (ORF 1) 
shows significant homology to DNA polymerases of a number of bacteriophages, 
bacteria and fungi, and the product of this gene is likely responsible for replicating 
the genetic material of bacteriophage 44AHJD. ORF 2 encodes a protein with 
homology to the product of the dinC gene of Bacillus subtilis, a protein involved in 
teichoic acid biosynthesis. Teichoic acid is a polyphosphate polymer found in some, 
but not all, Gram positive organisms (and not in Gram negative organisms), where it 
is attached to the peptidoglycan layer. The phage protein may thus be involved in the 
synthesis of this material for incorporation into the cell wall, allowing enhanced lysis 
by the phage lysis enzymes or, as many enzymes can function in "reverse reactions", 
may be involved in its degradation, allowing for penetration of the peptidoglycan and 
phage genome entry into the cell following adsorption. The similarity between 
Staphylococcus aureus bacteriophage 44AHJD and E. coli phage T7 indicates that 
they may share similar mechanisms of replication and growth. Both phages belong to 
the Podoviridae Family of bacteriophages and are members of the "T7-like" Genus 
of this Family (Ackermann and DuBow; VIth ICTV Report). 



Two genes, ORF 9 and 12, were identified with the potential to encode 
antimicrobial protein products. The homology alignments are shown in Tables 19 and 
20. The predicted product of ORF 9 is related to a class of genes which encodes 
lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide 
cell wall structure of a variety of micro-organisms, including that from the 
Staphylococcus aureus bacteriophage Twort. ORF 12 of Staphylococcus aureus 
bacteriophage 44AHJD shows homology to a set of lysis proteins from several 
bacteriophages. These lysis proteins are also referred to as holins, and represent 
phage-encoded lysis functions required for transit of the phage murein hydrolases 
(lysozyme) to the periplasm, where they can digest the cell wall and thus lyse the 
bacterium. 

Thus, in particular embodiments, the present invention provides a nucleic acid 
sequence isolated from Staphylococcus aureus bacteriophage 44AHJD comprising at 
least a portion of one of the genes described above with antimicrobial activity. For 
example, ORF 1 encodes a DNA polymerase function. This polymerase may utilize 
host-derived accessory proteins for its activity when replicating the phage template, 
sequestering such proteins from use by the bacterial polymerase, resulting in 
inhibition of DNA replication, cell division, and cell growth. Alternatively, ORF 9 
directly encodes a polypeptide with antimicrobial activity. ORF 9 is predicted to 
encode an amidase, a protein known to act as a cell wall degrading enzyme. ORF 12 
likely encodes a holin function required for transit of the phage amidase (gene 9 
product) to the periplasm. When this type of gene product from Bacillus phage phi 29 
(gene 14) was cloned in Escherichia coli, cell death ensued, whereas production of 
protein from Bacillus phage phi 29 gene 14 concomitantly with the phi 29 lysozyme 
or unrelated murein-degrading enzymes led to lysis, suggesting that membrane-bound 
protein 14 induces a nonspecific lesion in the cytoplasmic membrane (Steiner et al., 
1993). 

The present invention also provides the use of the Staphylococcus 
bacteriophage 44AHJD antimicrobial ORFs or ORF products as pharmacological 
agents, either wholly or in part, and of derivatives thereof, as well as the use of 
corresponding peptidomimetics, developed from amino acid or nucleotide sequence 
knowledge derived from Staphylococcus bacteriophage 44AHJD killer ORFs. 

Enterococcus phage 182 

Bacteriophage 182 was obtained from the Felix d'Herelle phage collection 
(Ste. Foy, Quebec) and infects Enterococcus sp. Group D. The genome of 
Enterococcus bacteriophage 182 consists of 17,833 bp (Table 21) and is predicted to 
encode 80 ORFs greater than 33 amino acids (Tables 22 and 23). Computational 
analysis of the predicted protein products of Enterococcus bacteriophage 182 was 
performed in order to identify protein products related to those deposited in public 
databases. Bacteriophage 182 protein products which detected sequences with 
significant sequence similarity in public databases are listed in Tables 24 and 26, along 
with the accompanying list of related proteins. 

From this analysis, it is apparent that 5 genes (ORF 001, 004, 007, 009, and 
011) are related to structural proteins of several Bacillus phages: Bacillus 
bacteriophage PZA, phi-29, and B103. These include genes predicted to encode a tail 
protein (ORF 001), a head protein (ORF 004), an upper collar protein (ORF 007), a 
lower collar protein (ORF 009), and a pre-neck appendage protein (ORF 011). Two 
genes, ORF 005 and 019, are predicted to encode products which direct phage 
morphogenesis. 

Bioinformatics has also identified three genes whose products are likely 
involved in phage DNA synthesis. One gene, ORF 002, shows significant homology to 
DNA polymerases of a number of bacteriophages, and the product of this gene is 
likely responsible for replicating the genetic material of bacteriophage 182. ORF 006 
encodes a protein with homology to the encapsidation proteins of several other 
bacteriophages, including Bacillus phage phi-29 (P11014), PZA (P07541), and B103 
(X99260) and Streptococcus phage CP-1 (Z47794). These gene products catalyze the 
in vivo and in vitro genome-encapsidation reaction (Garvey et al., 1985). Proteins 
involved in genome packaging have been shown to have additional activities that 
affect biochemical reactions in other phages and their hosts. For example, the coat 
protein of the RNA bacteriophage MS2 interacts with viral RNA to translationally 
repress replicase synthesis (Pickett and Peabody, 1993). This protein-RNA interaction 
also plays a role in genome encapsidation, enveloping a single copy of the viral 
genome in a protein shell composed of many molecules of coat protein. In addition, 
the bacteriophage lambda terminase enzyme can be lethal to E. coli when expressed, 
suggesting cleavage of packaging sites in the bacterial chromosome. Also present 
within bacteriophage 182 is a gene, ORF 010, that encodes a protein that is related to 
the terminal proteins of Bacillus phage Nf (P06812), Bacillus phage GA-1 (X96987) 
and Bacillus phage B103 (X99260). DNA terminal proteins are linked to the 5' ends 
of both strands of the genome and are essential for DNA replication, playing a role in 
initial priming of DNA replication. The similarity between Enterococcus 
bacteriophage 182 and Bacillus phages phi-29, PZA, and B103 indicates that they 
may share similar mechanisms of replication and growth. Protein-primed DNA 
replication is a well described phenomenon, and in the phi-29-like phages, the ends of 
the DNA serve as origins and termini of replication (Gutierrez et al., 1986; Yoshikawa 
et al., 1985). 

There is also a gene (ORF 015) that encodes a protein showing homology to 
an early protein product of Bacillus bacteriophage PZA and the single-strand nucleic 
acid binding protein of bacteriophage B103. 

Two genes, ORF 008 and 014, were identified with the potential to encode 
anti-microbial protein products. The homology alignments are shown in Tables 24 & 
26 and biochemical features of the predicted polypeptides are shown in Table 25. The 
predicted product of ORF 008 is related to a class of genes which encodes 
lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide 
cell wall structure of a variety of micro-organisms. ORF 014 of Enterococcus 182 shows 
homology to a set of lysis proteins from Bacillus bacteriophage phi-29, PZA, and 
B103. These lysis proteins are also referred to as holins and represent phage encoded 
lysis functions required for transit of the phage murein hydrolases (lysozyme) to the 
periplasm, where they can digest the outer cell wall and thus lyse the bacterium. 

Thus, the present invention provides a nucleic acid sequence obtained from 
Enterococcus bacteriophage 182 comprising at least a portion of a phage 182 ORF, 
preferably an inhibitory ORF, and more preferably at least a portion of one of the 
genes described above with anti-microbial activity. For example, ORF 002 encodes a 
DNA polymerase function. This polymerase may utilize host-derived accessory 
proteins for its activity when replicating the phage template, sequestering such 
proteins from use by the bacterial polymerase, resulting in inhibition of DNA 
replication, cell division, and cell growth. Alternatively, ORFs 008 or 014 directly 
encode polypeptides with anti-microbial activity. ORF 008 is predicted to encode an 
autolytic lysozyme, a protein known to have anti-microbial activity (Martin et al., 
1998). ORF 014 likely encodes a holin function required for transit of the phage 
murein hydrolases to the periplasm. When the related product from Bacillus phage phi 
29 (gene 14) was cloned in Escherichia coli, cell death ensued, whereas production of 
protein from Bacillus phage phi 29 gene 14 concomitantly with the phi 29 lysozyme 
or unrelated murein-degrading enzymes led to lysis, suggesting that membrane-bound 
protein 14 induces a nonspecific lesion in the cytoplasmic membrane (Steiner et al., 
1993). 

The present invention also provides the use of the Enterococcus bacteriophage 
182 anti-microbial ORFs as pharmacological agents, either wholly or in part, and of 
derivatives thereof, as well as the use of corresponding peptidomimetics, developed from 
amino acid or nucleotide sequence knowledge derived from Enterococcus 
bacteriophage 182 killer ORFs. This can be done where the structure of the 
peptidomimetic compound corresponds to the structure of the active portion of a 
product of an ORF. In this analysis, the peptide backbone is transformed into a carbon 
based hydrophobic structure that can retain cytostatic or cytocidal activity for the 
bacterium. This is done by standard medicinal chemistry methods, measuring growth 
inhibition of the various molecules in liquid cultures or on solid medium. These 
mimetics also represent lead compounds for the development of novel antibiotics. In 
this context, "corresponds" means that the peptidomimetic compound structure has 
sufficient similarities to the structure of the active portion of a product of one of the 
Enterococcus ORFs listed, that the peptidomimetic will interact with the same 
molecule as the product of the ORF, and preferably will elicit at least one cellular 
response in common which relates to the inhibition of the cell by the phage protein. 

To validate the identity of an ORF as a killer ORF, it is preferably expressed 
in the host or other test bacterial organism and the effect of this expression on 
bacterial growth and replication is assessed. Therefore, all individual ORFs identified 
herein, e.g., those identified above, can be expressed, preferably overexpressed, in a 
suitable host bacterium, e.g., a host Enterococcus, and the effect of this expression or 
overexpression on host metabolism and viability can be measured. 

Individual ORFs can be resynthesized from the phage genomic DNA by the 
polymerase chain reaction (PCR) using oligonucleotide primers flanking the ORF on 
either side. Those skilled in the art are familiar with the design and synthesis of 
appropriate primer sequences. These single ORFs are preferably engineered so that 
they contain appropriate cloning sites at their extremities to allow their introduction 
into a new bacterial expression plasmid, allowing propagation in a standard bacterial 
host such as E. coli, but containing the necessary information for plasmid replication 
in the target microbe, Enterococcus sp. (hereafter referred to as a shuttle vector). 
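
As an illustration of engineering cloning sites at the extremities of an amplified 
ORF, the following minimal sketch builds a forward and a reverse primer, each 
carrying a restriction site at its 5' end. The particular sites (BamHI, GGATCC, and 
EcoRI, GAATTC), the 20-nucleotide annealing length, the short 5' clamp, and the 
function name are assumptions chosen for the example and are not taken from the 
Examples; in this sketch the primers anneal to the ends of the ORF itself rather than 
to flanking sequence. 

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def design_cloning_primers(template: str, orf_start: int, orf_end: int,
                           site_5: str = "GGATCC",   # BamHI (assumed choice)
                           site_3: str = "GAATTC",   # EcoRI (assumed choice)
                           anneal_len: int = 20,
                           clamp: str = "CGC"):
    """Return (forward, reverse) primers, both written 5' to 3'.

    orf_start/orf_end are 0-based, half-open coordinates of the ORF on the
    top strand of `template`. The clamp is a few extra bases that help the
    restriction enzyme cut near the end of the PCR product.
    """
    orf = template[orf_start:orf_end].upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    # A site that also occurs inside the ORF would cut the insert itself.
    for site in (site_5, site_3):
        if site in orf:
            raise ValueError(f"restriction site {site} occurs within the ORF")
    forward = clamp + site_5 + orf[:anneal_len]
    reverse = clamp + site_3 + reverse_complement(orf[-anneal_len:])
    return forward, reverse
```

The ValueError for an internal site reflects a practical design choice: a different 
enzyme would then be selected so that digestion leaves the ORF intact. 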

This shuttle vector also preferably contains regulatory sequences that allow 
inducible expression of the introduced ORF. As the candidate ORF may encode a 
killer function that will eliminate the host, it is highly advantageous that it not be 
expressed (or at least not expressed at a substantial level) prior to testing for activity; 
thus screening for such sequences in a constitutive fashion is less likely to be 
successful (lethality). In an example presented in Fig. 7, regulatory sequences from 
the ars operon are used to direct individual ORF expression in Enterococcus. The ars 
operon encodes a series of proteins which normally mediate the extrusion of arsenite 
and several other trivalent oxyanions from the cells when they are exposed to such 
toxic substances in their environment. The operon encoding this detoxifying 
mechanism is normally silent and only induced when arsenite-related compounds are 
present. 

Therefore, individual phage ORFs can be expressed in Enterococcus or other 
suitable host in an inducible fashion by adding to the culture medium non-toxic 
arsenite concentrations during the growth of individual Enterococcus (or other host 
cell) clones expressing such individual phage ORFs. Toxicity of the phage killer 
ORF for the host is monitored by reduction or arrest of growth under induction 
conditions, as measured by optical density in liquid culture or after plating the 
induced cultures on solid medium. Subsequently, interference of the phage ORF with 
the host biochemical pathways ultimately leading to reducing or arresting host 
metabolism can be measured by pulse chase experiments using radiolabeled 
precursors of either DNA replication, RNA transcription, or protein synthesis. 
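
The optical density readout described above can be summarized numerically, for 
instance as the fractional reduction in net growth of an induced culture relative to an 
uninduced control of the same clone. The following is a minimal sketch; the 
function names and the 0.5 threshold for calling a candidate killer ORF are 
hypothetical and are given only to make the comparison concrete. 

```python
def growth_inhibition(od_uninduced, od_induced):
    """Fractional reduction in net OD gain caused by induction.

    Both arguments are OD readings of the same clone taken at the same
    time points, without and with inducer. Returns 0.0 for no effect and
    1.0 for complete growth arrest; values above 1.0 indicate a falling
    OD, as might be seen with lysis.
    """
    if len(od_uninduced) != len(od_induced) or len(od_uninduced) < 2:
        raise ValueError("need two series of equal length (>= 2 readings)")
    gain_control = od_uninduced[-1] - od_uninduced[0]
    gain_induced = od_induced[-1] - od_induced[0]
    if gain_control <= 0:
        raise ValueError("uninduced control culture did not grow")
    return 1.0 - gain_induced / gain_control


def is_candidate_killer(od_uninduced, od_induced, threshold=0.5):
    """Flag a clone whose induced growth is reduced by at least `threshold`.

    The threshold is an arbitrary illustrative value, not a validated cutoff.
    """
    return growth_inhibition(od_uninduced, od_induced) >= threshold
```

A clone carrying the vector without insert, treated with the same arsenite 
concentration, would serve as the natural control for any effect of the inducer itself. 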

Of course, other inducible regulatory sequences (e.g., promoters, operators, 
etc.) may be used (e.g., systems using positive induction of expression or systems 
using release of repression). A variety of such systems are known to those skilled in 
the art and can be utilized in the present invention. 

Nucleic acid sequences of the present invention can be isolated using a method 
similar to those described herein or other methods known to those skilled in the art. 
In addition, such nucleic acid sequences can be chemically synthesized by well- 
known methods. Having the phage 182 ORFs, e.g., anti-bacterial ORFs of the present 
invention, portions thereof, or oligonucleotides derived therefrom as described, other 
anti-microbial sequences from other bacteriophage sources can be identified and 
isolated using methods described here or other methods, including methods utilizing 
nucleic acid hybridization and/or computer-based sequence alignment methods. 

The invention also provides bacteriophage anti-microbial DNA segments from 
other phages based on nucleic acids and sequences hybridizing to the presently 
identified inhibitory ORF under high stringency conditions or sequences which are 
highly homologous. The bacteriophage anti-microbial DNA segment from 
bacteriophage 182 can be used to identify a related segment from another unrelated 
phage based on stringent conditions of hybridization or on being a homolog based on 
nucleic acid and/or amino acid sequence comparisons. As with the phage 182 
inhibitory sequences, such homologous coding sequences and products can be used as 
antimicrobials, to construct active portions or derivatives, to construct 
peptidomimetics, and to identify bacterial targets. 

Enterococcus sequences are listed in Table 27 by accession number, providing 
identification of possible targets of Enterococcus phage inhibitory ORF products, e.g., 
from phage 182. 

Streptococcus pneumoniae 

As indicated in the Summary above, the present invention is concerned 
with the use of Streptococcus sp. bacteriophage Dp-1 coding sequences and the 
encoded polypeptides or RNA transcripts to identify bacterial targets for potential new 
antibacterial agents. 

Streptococcus pneumoniae is an important cause of community-acquired 
pneumonia and a major cause of otitis media, sinusitis, and meningitis in children and 
adults. In Spain and other Mediterranean countries, the majority of S. pneumoniae are 
relatively resistant to penicillin (Klugman, 1990; Fenoll et al., 1991; Jorgensen et al., 
1990). These strains also have decreased susceptibility to broad-spectrum 
cephalosporins, which are frequently used in the empiric treatment of meningitis and 
other serious invasive bacterial infections. High-level resistance of pneumococci has 
been encountered in Hungary where 70% of children who were colonized with S. 
pneumoniae carried penicillin resistant strains that were also resistant to tetracycline, 
erythromycin, trimethoprim/sulfamethoxazole, and 30% resistant to chloramphenicol 
(Neu, 1992). The resistance of pneumococci to macrolides such as erythromycin 
averages 20-25% in France, ~20% in Japan, and <10% in Spain (Neu, 1992). 

The antimicrobial susceptibilities and distribution of serotypes of the 42 
isolates of S. pneumoniae in southern Taiwan from invasive infections have been 
recently determined (Hseuh et al., 1996). Resistance rates among these isolates were: 
erythromycin, 61.9%; clindamycin, 47.6%; chloramphenicol, 19%; and tetracycline, 
73.8%. Resistance to three or more classes of antibiotics was found in 33.3% of the 
isolates. Bacteremic pneumonia and primary bacteremia accounted for 64.3% of the 
infections and mortality was 42.6%. Given the severity of these infections despite 
adequate antibiotic therapy, there is clearly a need for introduction of new therapeutic 
options to prevent mortality due to invasive S. pneumoniae infections. 

Pneumococcal phages belong to four families and they present a great variety 
in morphology, including lytic and temperate phages (for a review, see Garcia et al., 
1997). Examples of lytic phages are Cp-1 and Dp-1, whereas examples of temperate 
phages are HB-3, EJ-1, and HB-746. The complete nucleotide sequence and 
functional organization of Cp-1 has been reported (Martin et al., 1996). Cp-1 has a 
19,345 bp double-stranded DNA genome, with a terminal protein covalently linked to 
its 5' ends, that replicates by a protein primed mechanism. The phage contains 29 
ORFs, 23 on one strand and 6 on the opposite. When these predicted proteins were 
compared to sequences compiled in GenBank EMBL databases, to ORFs showed 

significant similarity to proteins of bacteriophage phi-29 that infects B. subtilis (Martin et 
al., 1996). The similar proteins corresponded to those involved in DNA replication 
(terminal protein and DNA polymerase), structural and morphogenic proteins (major 
head, collar, connector, tail, and encapsidation proteins), and proteins involved in lysis 
function (holin and lysozyme). In its strategy of lysis, the holin gene product inserts 
itself into the cell membrane, allowing access of the lysozyme to the peptidoglycan. 
Expression of the Cp-1 holin protein in E. coli resulted in cell death after 2 hours of 
induction, but did not lead to lysis (Garcia et al., 1997). Cells harboring a plasmid 
construction with holin and lysozyme genes together did lyse after induction and the 
viability loss was similar to that of the culture expressing holin alone. Cloning of 
these lytic genes in S. pneumoniae showed that both genes had the same effect as in E. 
coli. That is, holin itself did not lyse the culture but the viability loss was noticeable, 
whereas both holin and lysozyme together were capable of lysing M31, an 
amidase-deleted mutant (Garcia et al., 1997). 

Recently, a small portion (~4 kbp) of a second S. pneumoniae phage, Dp-1, 
has been sequenced (Sheehan et al., 1997). This portion contains the genes coding for 
the lytic system (Sheehan et al., 1997) and shows a modular organization similar to 
that described for Cp-1. However, in this case, a single chimeric protein appears to be 
made in which the N-terminal domain is highly similar to that of the murein hydrolase 
coded by a gene found in the phage BK5-T that infects Lactococcus lactis, and the 
C-terminal domain is homologous to holins. Thus, both functions appear to have been 
combined in a novel chimeric protein. 

Bacteriophage Dp-1 was obtained from Dr. P. Garcia (Departamento de 
Microbiologia Molecular, Centro de Departamento de Investigaciones Biologicas, 
Consejo Superior de Investigaciones Cientificas, Velazquez, Madrid, Spain). We 
found that Dp-1 has a double-stranded DNA genome of 56,506 bp, predicted to 
encode 85 ORFs greater than 33 amino acids and with upstream Shine-Dalgarno 
motifs for translation initiation (Tables 28 & 30, and Fig. 6). Computational analysis 
of the predicted protein products of Streptococcus bacteriophage Dp-1 detected 
homologs in public databases; these are listed in Table 31, along 
with the accompanying list of related proteins. 
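
The requirement for an upstream Shine-Dalgarno motif can be illustrated with a 
simple check for a partial match to the consensus AGGAGG in a short window 
upstream of a candidate start codon. This is a minimal sketch; the 20-nucleotide 
window, the minimum match length of 4, and the function name are assumptions, 
and the criteria actually applied in compiling the Tables are not specified here. 

```python
SD_CONSENSUS = "AGGAGG"


def has_shine_dalgarno(seq: str, start: int,
                       window: int = 20, min_match: int = 4) -> bool:
    """Report whether an SD-like motif lies just upstream of a start codon.

    `start` is the 0-based index of the first base of the start codon.
    A motif is any contiguous stretch of at least `min_match` bases of
    AGGAGG found within `window` nucleotides upstream of `start`.
    """
    upstream = seq[max(0, start - window):start].upper()
    cores = {
        SD_CONSENSUS[i:i + min_match]
        for i in range(len(SD_CONSENSUS) - min_match + 1)
    }
    return any(core in upstream for core in cores)
```

With min_match=4 the accepted cores are AGGA, GGAG, and GAGG; raising 
min_match makes the filter stricter at the cost of missing weaker ribosome binding 
sites. 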

From this analysis, it is apparent that several predicted genes of Dp-1 encode 
polypeptides that are related to structural proteins. ORFs 001, 002, 004, and 030 are 
predicted to encode tail proteins, minor structural proteins, and minor capsid proteins 
(Table 31). We also note the identification of several gene products that are likely 
involved in DNA synthesis. These include ORF 3, which encodes DNA polymerase, 
ORF 8, which encodes a SWI/SNF helicase-related protein, ORF 10, which encodes a 
protein showing homology to recA, and ORF 13, which is a dnaZX-like ORF. 

In E. coli, the rapA gene encodes an RNA polymerase (RNAP)-associated protein 
with ATPase activity which is a homolog of the eukaryotic SWI/SNF family, a set of 
proteins whose members are involved in transcription activation, 
nucleosome remodeling, and DNA repair. RapA forms a stable complex with RNAP, 
as if it were a subunit of RNAP, and it is possible that the ORF 8 product behaves 
similarly or in a dominant-negative fashion to inhibit the activity of RapA. Mutation 
of the essential E. coli dnaZX results in a block in DNA chain elongation during 
replication (Maki et al., 1988). The dnaZX gene has only one open reading frame for 
a 71-kDa polypeptide from which the two distinct DNA polymerase III holoenzyme 
subunits, tau (71 kDa) and gamma (47 kDa), are produced. The tau subunit is the 
precursor of the gamma subunit, and the gamma subunit is produced by a -1 
frameshift causing early termination of translation (Tsuchihashi et al., 1990). These 
proteins show single-strand DNA binding properties that are ATPase (and dATPase) 
dependent and are thought to increase the processivity of the core DNA polymerase 
enzyme (Lee et al., 1987). 

There are several Dp-1 ORFs which encode proteins predicted to play a role in 
cellular metabolic pathways. These include polypeptides involved in coenzyme PQQ 
synthesis (ORFs 20, 29, 38). Pyrrolo-quinoline quinone (PQQ) is the non-covalently 
bound prosthetic group of many quinoproteins catalysing reactions in the periplasm of 
Gram-negative bacteria. Most of these involve the oxidation of alcohols or aldose 
sugars. Interestingly, ORFs 20, 29, and 30 also show homology to the exoenzyme S 
regulon (Frank, 1997). Proteins encoded by the P. aeruginosa exoenzyme S regulon 
may be involved in a contact-mediated translocation mechanism to transfer anti-host 
factors directly into eukaryotic cells, disrupting eukaryotic signal transduction through 
ADP-ribosylation (Frank, 1997). 

There is also a protein with similarity to GTP cyclohydrolase I (ORF 21) and 
ORF 41 which shows homology to dUTPase (Table 31). GTP cyclohydrolase I is an 
enzyme that catalyzes the first reaction in the pathway for the biosynthesis of the 
pteridine, a cofactor of the monooxygenases of the aromatic amino acids. Disruption 
of the homologous gene in Saccharomyces cerevisiae leads to a recessive conditional 
lethality due to folinic acid auxotrophy, that can be complemented with the 
mammalian or bacterial GTP cyclohydrolase I enzymes (Nardese et al., 1996; Mancini 
et al., 1999). 

ORF 16 shows high homology to autolysin. This region of the phage sequence 
was previously reported (Sheehan et al., 1997) and encompasses ~4 kbp of our 
sequence. The sequence published by Sheehan et al. (1997) is shown in Table 32. 

Thus, the present invention provides a nucleic acid sequence obtained from 
Streptococcus bacteriophage Dp-1 comprising at least a portion of a phage Dp-1 ORF, 
preferably an inhibitory ORF, and more preferably at least a portion of one of the 
genes described above with anti-microbial activity. For example, ORF 013 encodes a 
protein with homology to the gamma subunit of DNA polymerase (dnaX gene). This 
protein may act in a dominant-negative fashion to sequester the host DNA polymerase 
for its own replication, thus inhibiting host DNA replication. The dnaX gene product 
is essential for E. coli replication (Kodaira et al., 1983). 

In certain preferred embodiments of the present invention, the bacterial target of 
a bacteriophage inhibitor ORF product, e.g., an inhibitory protein or polypeptide, is 
encoded by a Streptococcus nucleic acid coding sequence from a host bacterium for 
bacteriophage Dp-1. As above, possible target sequences are described herein by 
reference to sequence source sites. The sequence encoding the target preferably 
corresponds to a Streptococcus nucleic acid sequence available from The Institute for 
Genomic Research (TIGR), or available from GenBank or other public database. The 
TIGR Streptococcus sequences are publicly available from The Institute for Genomic 
Research at URL: http://www.tigr.org 

The amino acid sequence of a polypeptide target is readily provided by 
translating the corresponding coding region. For the sake of brevity, the sequences 
are not reproduced herein. Also, in preferred embodiments, a target sequence 
corresponds to a Streptococcus pneumoniae coding sequence corresponding to a 
sequence listed in Table 33 herein. Sequences for other Streptococcal species are also 
available from TIGR and/or from GenBank. The listing in Table 33 describes 
Streptococcus sequences currently deposited in GenBank. Again, for the sake of 
brevity, the sequences are described by reference to the GenBank entries instead of 
being written out in full herein. In cases where the TIGR or GenBank entry for a 
coding region is not complete, the complete sequence can be readily obtained by 
routine methods, e.g., by isolating a clone in a phage Dp-1 host Streptococcus sp. 
genomic library, and sequencing the clone insert to provide the relevant coding 
region. The boundaries of the coding region can be identified by conventional 
sequence analysis and/or by expression in a bacterium in which the endogenous copy 
of the coding region has been inactivated and using subcloning to identify the 
functional start and stop codons for the coding region. 

In the various aspects of this invention involving Dp-1 sequences, the 
sequence is preferably not contained in the sequence described in Sheehan et al., 1997 
(Table 32). 

Validating Identified Inhibitory Phage ORFs 

A fifth step involves validating the identified phage inhibitor ORF by 
independent methods, and delineating further possible smaller segments of the ORFs 
that have inhibitory activity. Several methods exist to validate the role of the 
identified ORF as an inhibitor ORF. 

One example utilizes the creation of a mutant variant of the phage ORF in 
which the candidate ORF carries a partial or complete loss-of-function mutation that 
is measurable as compared with the non-mutant ORF. Comparison of the effects of 
expression of the loss-of-function mutant with those of the normal ORF provides 
confirmation of the identification of an inhibitor ORF where the loss-of-function 
mutant provides a measurably lower level of inhibition, preferably no inhibition. The 
loss of function may be conditional, e.g., temperature sensitive. 

Once validation of the inhibitor ORF is achieved, a bi-directional deletion 
analysis can be carried out using the same experimental system to identify the 
minimal polypeptide segment that has inhibitor activity. This may be carried out by a 
variety of means, e.g., by exonuclease or PCR methodologies, and is used to 
determine if a relatively small segment of the ORF (i.e., the product of the ORF) still 
possesses inhibitory activity when isolated away from its native sequence. If so, a 
portion of the ORF encoding this "active portion" can be used as a template for the 
synthesis of novel anti-microbial agents and further allows derivation of the peptide 
sequence, e.g., using modified peptides and/or peptidomimetics. 
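
The logic of a bi-directional deletion analysis can be sketched as follows. This is only an illustration under stated assumptions: `is_active` is a hypothetical stand-in for the experimental inhibition assay, residue coordinates are 0-based and half-open, and activity is assumed to be retained by any segment that still contains the active core (so that trimming each end independently is valid).

```python
from typing import Callable, Tuple


def minimal_active_segment(
    length: int,
    is_active: Callable[[int, int], bool],
    step: int = 1,
) -> Tuple[int, int]:
    """Trim the N-terminal end, then the C-terminal end, while activity persists.

    `is_active(start, end)` is a hypothetical assay result for the segment
    spanning residues [start, end). Returns the smallest retained segment.
    """
    start, end = 0, length
    # Delete from the N-terminal side in increments of `step`.
    while start + step < end and is_active(start + step, end):
        start += step
    # Delete from the C-terminal side in increments of `step`.
    while end - step > start and is_active(start, end - step):
        end -= step
    return start, end


if __name__ == "__main__":
    # Hypothetical 100-residue ORF product whose activity needs residues 30-49.
    def assay(s: int, e: int) -> bool:
        return s <= 30 and e >= 50

    print(minimal_active_segment(100, assay))
```

A coarser `step` mirrors the resolution of an exonuclease deletion series, while `step=1` corresponds to residue-by-residue truncation by PCR.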

In creation of certain peptidomimetics, the peptide backbone is transformed 
into a carbon-based hydrophobic structure that can retain inhibitor activity against the 
bacterium. This is done by standard medicinal chemistry methods, typically 
monitored by measuring growth inhibition by the various molecules in liquid cultures 
or on solid medium. These mimetics can also represent lead compounds for the 
development of novel antibiotics. 

Recently, a major effort has been undertaken by the pharmaceutical industry 
and their biotechnology partners for the sequencing of bacterial pathogen genomes. 
The rationale is that the systematic sequencing of the genome will identify all of the 
bacterial proteins and therefore this proteome will be the target for designing novel 
inhibitor antibiotics. Although systematic, this approach has several major problems. 
The first is that analysis of primary amino acid sequences of bacterial proteins does 
not immediately reveal which protein will be essential for viability of the bacterium, 
and target validation is thus a major issue. The second problem is one of redundancy, 
as several biochemical pathways are either structurally duplicated in bacteria 
(different isoforms of the same enzyme), or functionally duplicated by the presence of 
salvage pathways in the event of a metabolic block in one pathway (different 
nutritional conditions). The third is that even a valid target may not be structurally or 
functionally amenable to inhibition by small molecules because of inaccessibility 
(sequestration of target). 

Therefore, there is considerable interest within the pharmaceutical and 
biotechnology industries in identifying key targets for drug discovery amongst the mass 
of novel targets generated by large-scale genomic sequencing projects. 

On the other hand, and underscoring the instant invention, the phages herein 
described have, over millions of years, evolved specific mechanisms to target such 
key biochemical pathways and proteins. In the few cases where inhibition by phages 
has been elucidated (e.g., see ref. 3), such bacterial targets are invariably rate-limiting 
in their respective biochemical pathways, are not redundant, and/or are readily 
accessible for inhibition by the phage (or by another inhibitory compound). 
Therefore, the sixth step of this invention involves identifying the host biochemical 
pathways and proteins that are targeted by the phage inhibitory mechanisms. 

Identifying, Validating, and Characterizing Bacterial Host Target Proteins and 
Affected Pathways 

A rationale for this step is that the inhibitor ORF product from the phage 
physically interacts with and/or modifies certain microbial host components to block 
their function. Exemplary approaches which can be used to identify the host bacterial 
pathways and proteins that interact with, and preferably also are inhibited by, phage 
ORF product(s) are described below. 

One approach is a genetic screen to determine physiological protein:protein 
interaction, for example, using a yeast two-hybrid system. In this assay, the phage 
ORF is fused to the carboxyl terminus of the yeast Gal4 activation domain II (amino 
acids 768-881) to create a bait vector. A cDNA library of cloned S. aureus sequences, 
engineered into a plasmid in which the S. aureus sequences are fused to 
the DNA binding domain of Gal4, is also generated. These plasmids are introduced 
alone, or in combination, into yeast strain Y190, previously engineered with 
chromosomally integrated copies of the E. coli lacZ and the selectable HIS3 genes, 
both under Gal4 regulation (Durfee, T., Becherer, K., Chen, P.-L., Yeh, S.-H., Yang, 
Y., Kilburn, A.E., Lee, W.-H., and Elledge, S.J. (1993). Genes & Dev. 7, 555-569). If 
the two proteins expressed in yeast interact, the resulting complex will activate 
transcription from promoters containing Gal4 binding sites. A lacZ and a HIS3 gene, 
each driven by a promoter containing Gal4 binding sites, have been integrated into the 
genome of the host yeast system used for measuring protein-protein interactions. Such 
a system provides a physiological environment in which to detect potential protein 
interactions. This system has been extensively used to identify novel protein-protein 
interaction partners and to map the sites required for interaction (for example, to 
identify interacting partners of translation factors (Qiu, H., Garcia-Barrio, M.T., and 
Hinnebusch, A.G. (1998). Mol & Cell Biology 18, 2697-2711), transcription factors 
(Katagiri, T., Saito, H., Shinohara, A., Ogawa, H., Kamada, N., Nakamura, Y., and 
Miki, Y. (1998). Genes, Chromosomes & Cancer 21, 217-222), and proteins involved 
in signal transduction (Endo, T.A., Masuhara, M., Yokouchi, M., Suzuki, R., 
Sakamoto, H., Mitsui, K., Matsumoto, A., Tanimura, S., Ohtsubo, M., Misawa, H., 
Miyazaki, T., Leonor N., Taniguchi, T., Fujita, T., Kanakura, Y., Komiya, S., and 
Yoshimura, A. Nature. 387, 921-924)). This approach has also been used in many 
published reports to identify interactions between mammalian viral and mammalian 
cell proteins. 

For example, the non-structural protein NS1 of parvovirus is essential for viral 
DNA amplification and gene expression and is also the major cytopathic effector of 
these viruses. A yeast two-hybrid screen with NS1 identified a novel cellular protein 
of unknown function that interacts with NS1, called SGT, for small glutamine-rich 
tetratricopeptide repeat (TPR)-containing protein (Cziepluch C. Kordes E. Poirey R. 
Grewenig A. Rommelaere J. and Jauniaux JC. (1998) J Virol 72, 4149-4156). In 
another screen, the adenovirus E3 protein was recently shown to interact with a novel 
tumor necrosis factor alpha-inducible protein and to modulate some of the activities of 
E3 (Li Y. Kang J. and Horwitz M.S. (1998). Mol & Cell Biol 18, 1601-1610). In yet 
another recent screen, the herpes simplex virus 1 alpha regulatory protein ICP0 was 
found to interact with (and stabilize) the cell cycle regulator cyclin D3 (Kawaguchi Y. 
Van Sant C. and Roizman B. (1997). J Virol 71, 7328-7336). 

Another two-hybrid system for identifying protein:protein interactions is 
commercially available from STRATAGENE™ as the CYTO-TRAP™ system 
(Chang et al., Strategies Newsletter 11(3), 65-68 (1998) (from Stratagene)). The 
system is a yeast-based method for detecting protein:protein interactions in vivo, using 
activation of the Ras signal transduction cascade by localizing a signal pathway 
component, human Sos (hSos), to its activation site in the yeast plasma membrane. 
The system uses a temperature-sensitive Saccharomyces cerevisiae mutant, strain 
cdc25H, which contains a point mutation at amino acid residue 1328 of the cdc25 
gene. This gene encodes a guanyl nucleotide exchange factor which binds and 
activates Ras, leading to cell growth. The mutation in the cdc25 gene prevents host 
growth at 37°C, but at a permissive temperature of 25°C, growth is normal. The 
system utilizes the ability of hSos to complement the cdc25 defect and activate the 
yeast Ras signaling pathway. Once hSos is expressed and localized to the plasma 
membrane, the cdc25H yeast strain grows at 37°C. Localizing hSos to the plasma 
membrane occurs through a protein:protein interaction. A protein of interest, or bait, 
is expressed as a fusion protein with hSos. The library, or target, proteins are 
expressed with the myristylation membrane-localization signal. The yeast cells are 
then incubated under restrictive conditions (37°C). If the bait and the target protein 
interact, the hSos protein is recruited to the membrane, activating the Ras signaling 
pathway and allowing the cdc25H yeast strain to grow at the restrictive temperature. 

The protein targets of phage inhibitory ORFs can also be identified using 
bacterial genetic screens. One approach involves the overexpression of a phage 
inhibitory protein in a mutagenized bacterial host species, followed by plating the cells 
and searching for colonies that can survive the antimicrobial activity of the inhibitory 
ORF. These colonies are then grown, and their DNA is extracted and cloned into an 
expression vector that contains a replicon of a different incompatibility group from 
the plasmid expressing the original ORF. This library is then introduced into a 
wild-type host bacterium in conjunction with an expression vector driving synthesis of 
the phage ORF, followed by selection for surviving bacteria. Thus, bacterial DNA 
fragments from the survivors presumably contain a DNA fragment from the original 
mutagenized host bacterial genome that can protect the cell from the antimicrobial 
activity of the inhibitory phage ORF. This fragment can be sequenced and compared 
with that of the bacterial host to determine in which gene the mutation lies. This 
approach enables one to determine the targets and pathways that are affected by the 
killing function. 

A second approach is based on identifying protein:protein interactions 
between the phage ORF product and bacterial, e.g., S. aureus, proteins using a 
biochemical approach based, for example, on affinity chromatography. This approach 
has been used, for example, to identify interactions between lambda phage proteins 
and proteins from their E. coli host (Sopta, M., Carthew, R.W., and Greenblatt, J. 
(1985) J. Biol. Chem. 260, 10353-10369). The phage ORF is fused to a peptide tag 
(e.g., glutathione-S-transferase ("GST"), 6xHIS ("HIS"), and/or calmodulin binding 
protein ("CBP")) within a commercially available plasmid vector that directs 
high-level expression on induction of a suitably responsive promoter driving the 
fusion's expression. The translated fusion protein is expressed in E. coli, purified, and 
immobilized on a solid phase matrix via, for example, the tag. Total cell extracts from 
the host bacterium, e.g., S. aureus, are then passed through the affinity matrix 
containing the immobilized phage ORF fusion protein; host proteins retained on the 
column are then eluted under different conditions of ionic strength, pH, detergents, 
etc., and characterized by gel electrophoresis and other techniques. Appropriate 
controls are run to guard against nonspecific binding to the resin. Target proteins thus 
recovered should be enriched for proteins that bind the phage protein/peptide of 
interest and are subsequently electrophoretically or otherwise separated, purified, 
sequenced, or biochemically analyzed. Usually sequencing entails individual digestion 
of the proteins to completion with a protease (e.g., trypsin), followed by molecular 
mass and amino acid composition and sequence determination using, for example, mass 
spectrometry, e.g., by MALDI-TOF technology (Qin, J., Fenyo, D., Zhao, Y., Hall, 
W.W., Chao, D.M., Wilson, C.J., Young, R.A. and Chait, B.T. (1997). Anal. Chem. 
69, 3995-4001). 

The sequences of the individual peptides from a single protein are then 
analyzed by the bioinformatics approach described above to identify the S. aureus 
protein interacting with the phage ORF. This analysis is performed by a computer 
search of the S. aureus genome for an identified sequence. Alternatively, all tryptic 
peptide fragments encoded by the S. aureus genome can be predicted by computer 
software, and the molecular mass of such fragments compared to the molecular mass of 
the peptides obtained from each interacting protein eluted from the affinity matrix. The 
responsible gene sequence can be obtained, for example, by using synthetic degenerate 
nucleic acid sequences to pull out the corresponding homologous bacterial sequence. 
Alternatively, antibodies can be generated against the peptide and used to isolate 
nascent peptide/mRNA transcript complexes, from which the mRNA can be reverse 
transcribed, cloned, and further characterized using the procedures discussed herein. 
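
The mass-comparison step described above can be illustrated with a short Python sketch. It assumes the commonly used trypsin rule (cleavage C-terminal to K or R, except before P), complete digestion with no missed cleavages, unmodified residues, and monoisotopic masses. The function names, the mass tolerance, and the two-protein example proteome are hypothetical and serve only to show the principle.

```python
from typing import Dict, List, Tuple

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to each free peptide


def tryptic_peptides(protein: str) -> List[str]:
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def rank_proteins(
    observed: List[float], proteome: Dict[str, str], tol: float = 0.01
) -> List[Tuple[str, int]]:
    """Rank predicted proteins by how many observed masses they explain."""
    scores = {}
    for name, seq in proteome.items():
        masses = [peptide_mass(p) for p in tryptic_peptides(seq)]
        scores[name] = sum(
            any(abs(m - obs) <= tol for m in masses) for obs in observed
        )
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    proteome = {"protA": "MAKRPGGKLLR", "protB": "GGGGKAAAAR"}  # hypothetical
    observed = [peptide_mass("RPGGK"), peptide_mass("LLR")]
    print(rank_proteins(observed, proteome))
```

In a real analysis the proteome dictionary would hold every predicted protein of the host genome, and the observed masses would come from the mass spectrum of each eluted protein after correcting for the charge state.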

A variety of other binding assay methods are known in the art and can be used 
to identify interactions between phage proteins and bacterial proteins or other bacterial 
cell components. Such methods that allow or provide identification of the bacterial 
component can be used in this invention for identifying putative targets. 

Validation of the interaction between the phage ORF product and the bacterial 
proteins or other components can be obtained by a second independent assay (e.g., 
co-immunoprecipitation or protein-protein crosslinking experiments (Qiu, H., 
Garcia-Barrio, M.T., and Hinnebusch, A.G. (1998). Mol & Cell Biology 18, 2697-2711; 
Brown, S. and Blumenthal, T. (1976). Proc. Natl. Acad. Sci. USA 73, 1131-1135)). 

Finally, the essential nature of the identified bacterial proteins is preferably 
determined genetically by creating a constitutive or inducible partial or complete loss- 
of-function mutation in the gene encoding the identified interacting bacterial protein. 
This mutant is then tested for bacterial survival and replication. 

The protein target of the phage inhibitor function can also be identified using a 
genetic approach. Two exemplary approaches will be delineated here. The first 
approach involves the overexpression of a predetermined phage inhibitor protein in 
mutagenized host bacteria, e.g., S. aureus, followed by plating the cells and searching 
for colonies that can survive the inhibitor. These colonies will then be grown, their 
DNA extracted and cloned into an expression vector that contains a replicon of a 
different incompatibility group, and preferably having a different selectable marker 
than the plasmid expressing the phage inhibitor. Thus, host DNA fragments from the 
mutant that can protect the cell from phage ORF inhibition can be sequenced and 
compared with that of the bacterial host to determine in which gene the mutation lies. 
This approach allows rapid determination of the targets and pathways that are affected 
by the inhibitor. 

Alternatively, the bacterial targets can be determined in the absence of 
selecting for mutations using an approach known as "multicopy suppression". In this 
approach, the DNA from the wild type host is cloned into an expression vector that 
can coexist, as previously described, with one containing a predetermined phage 
inhibitor. Those plasmids that contain host DNA fragments and genes that protect the 
host from the phage inhibitor can then be isolated and sequenced to identify putative 
targets and pathways in the host bacteria. 

Regardless of the specific mode of identification, screening assays may 
additionally utilize gene fusions to specific "reporter genes" to identify a bacterial 
gene(s) whose expression is affected when the host target pathway is affected by the 
phage inhibitor. Such gene fusions can be used to search a number of small molecule 
compounds for inhibitors that may affect this pathway and thus cause cell inhibition. 
This approach will allow the screening of a large number of molecules on petri dishes 
or in 96-well format by monitoring for a simple color change in the bacterial colonies. 
In this manner, we can validate host targets and classes of compounds for further 
study and clinical development. These inhibitors also represent lead compounds for 
the development of other antibiotics. 

Bioinformatics and comparative genomics are preferably then applied to the 
identified bacterial gene products to predict biochemical function. The biochemical 
activity of the protein can be verified in vitro in cell free assays or in vivo in intact 
cells. In vitro biochemical assays utilizing cell-free extracts or purified protein are 
established as a basis for the screening and development of inhibitors. 

These inhibitors, preferably small molecule inhibitors, may comprise peptides, 
antibodies, products from natural sources such as fungal or plant extracts or small 
molecule organic compounds. In general, small molecule organic compounds are 
preferred. These compounds may, for example, be identified within large compound 
libraries, including combinatorial libraries. For example, a plurality of compounds, 
preferably a large number of compounds, can be screened to determine whether any of 
the compounds binds or otherwise disrupts or inhibits the identified bacterial target. 
Compounds identified as having any of these activities can then be evaluated further 
in cell culture and/or animal model systems to determine the pharmacological 
properties of the compound, including the specific anti-microbial ability of the 
compound. 

For mixtures of natural products, including crude preparations, once a 
preparation or fraction of a preparation is shown to have an anti-microbial activity, 
the active substance can be isolated and identified using techniques well known in the 
art, if the compound is not already available in a purified form. 

Identified compounds possessing anti-microbial activity and similar 
compounds having structural similarity can be further evaluated and, if necessary, 
derivatized according to synthesis and/or modification methods available in the art 
selected as appropriate for the particular starting molecule. 

Derivatization of identified anti-microbials 

In cases where the identified anti-microbials above might represent peptidal 
compounds, the in vivo effectiveness of such compounds may be advantageously 
enhanced by chemical modification using the natural polypeptide as a starting point 
and incorporating changes that provide advantages for use, for example, increased 
stability to proteolytic degradation, reduced antigenicity, improved tissue penetration, 
and/or improved delivery characteristics. 

In addition to active modifications and derivative creations, it can also be 
useful to provide inactive modifications or derivatives for use as negative controls or 
for induction of immunologic tolerance. For example, a biologically inactive 
derivative which has essentially the same epitopes as the corresponding natural 
antimicrobial can be used to induce immunological tolerance in a patient being 
treated. The induction of tolerance can then allow uninterrupted treatment with the 
active anti-microbial to continue for a significantly longer period of time. 

Modified anti-microbial polypeptides and derivatives can be produced using a 
number of different types of modifications to the amino acid chain. Many such 
methods are known to those skilled in the art. The changes can include, for example, 
reduction of the size of the molecule, and/or the modification of the amino acid 
sequence of the molecule. In addition, a variety of different chemical modifications of 
the naturally occurring polypeptide can be used, either with or without modifications 
to the amino acid sequence or size of the molecule. Such chemical modifications can, 
for example, include the incorporation of modified or non-natural amino acids or 
non-amino acid moieties during synthesis of the peptide chain, or the post-synthesis 
modification of incorporated chain moieties. 

The oligopeptides of this invention can be synthesized chemically or through 
an appropriate gene expression system. Synthetic peptides can include both naturally 
occurring amino acids and laboratory synthesized, modified amino acids. 

Also provided herein are functional derivatives of anti-microbial proteins or 
polypeptides. By "functional derivative" is meant a "chemical derivative," 
"fragment," "variant," "chimera," or "hybrid" of the polypeptide or protein, which 
terms are defined below. A functional derivative retains at least a portion of the 
function of the protein, for example reactivity with a specific antibody, enzymatic 
activity or binding activity. 

A "chemical derivative" of the complex contains additional chemical moieties 
not normally a part of the protein or peptide. Such moieties may improve the 
molecule's solubility, absorption, biological half-life, and the like. The moieties may 
alternatively decrease the toxicity of the molecule, eliminate or attenuate any 
undesirable side effect of the molecule, and the like. Moieties capable of mediating 
such effects are disclosed in Alfonso and Gennaro (1995). Procedures for coupling 
such moieties to a molecule are well known in the art. Covalent modifications of the 
protein or peptides are included within the scope of this invention. Such 
modifications may be introduced into the molecule by reacting targeted amino acid 
residues of the peptide with an organic derivatizing agent that is capable of reacting 
with selected side chains or terminal residues, as described below. 

Cysteinyl residues most commonly are reacted with alpha-haloacetates (and 
corresponding amines), such as chloroacetic acid or chloroacetamide, to give 
carboxymethyl or carboxyamidomethyl derivatives. Cysteinyl residues also are 
derivatized by reaction with bromotrifluoroacetone, chloroacetyl phosphate, 
N-alkylmaleimides, 3-nitro-2-pyridyl disulfide, methyl 2-pyridyl disulfide, 
p-chloromercuribenzoate, 2-chloromercuri-4-nitrophenol, or 
chloro-7-nitrobenzo-2-oxa-1,3-diazole. 

Histidyl residues are derivatized by reaction with diethylpyrocarbonate at pH 
5.5-7.0 because this agent is relatively specific for the histidyl side chain. 
Para-bromophenacyl bromide also is useful; the reaction is preferably performed in 0.1 M 
sodium cacodylate at pH 6.0. 

Lysinyl and amino terminal residues are reacted with succinic or other 
carboxylic acid anhydrides. Derivatization with these agents has the effect of 
reversing the charge of the lysinyl residues. Other suitable reagents for derivatizing 
primary amine-containing residues include imidoesters such as methyl 
picolinimidate; pyridoxal phosphate; pyridoxal; chloroborohydride; 
trinitrobenzenesulfonic acid; O-methylisourea; 2,4-pentanedione; and 
transaminase-catalyzed reaction with glyoxylate. 

Arginyl residues are modified by reaction with one or several conventional 
reagents, among them phenylglyoxal, 2,3-butanedione, 1,2-cyclohexanedione, and 
ninhydrin. Derivatization of arginine residues requires that the reaction be performed 
in alkaline conditions because of the high pKa of the guanidine functional group. 
Furthermore, these reagents may react with the groups of lysine as well as the arginine 
alpha-amino group. 

Tyrosyl residues are well-known targets of modification for introduction of 
spectral labels by reaction with aromatic diazonium compounds or tetranitromethane. 
Most commonly, N-acetylimidazole and tetranitromethane are used to form O-acetyl 
tyrosyl species and 3-nitro derivatives, respectively. 

Carboxyl side groups (aspartyl or glutamyl) are selectively modified by 
reaction with carbodiimides (R'-N=C=N-R') such as 
1-cyclohexyl-3-(2-morpholinyl-(4-ethyl)) carbodiimide or 
1-ethyl-3-(4-azonia-4,4-dimethylpentyl) carbodiimide. Furthermore, 
aspartyl and glutamyl residues are converted to asparaginyl and glutaminyl residues 
by reaction with ammonium ions. 

Glutaminyl and asparaginyl residues are frequently deamidated to the 
corresponding glutamyl and aspartyl residues. Alternatively, these residues are 
deamidated under mildly acidic conditions. Either form of these residues falls within 
the scope of this invention. 

Derivatization with bifunctional agents is useful, for example, for cross-linking 
component peptides to each other or the complex to a water-insoluble support 
matrix or to other macromolecular carriers. Commonly used cross-linking agents 
include, for example, 1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, 
N-hydroxysuccinimide esters, for example, esters with 4-azidosalicylic acid, 
homobifunctional imidoesters, including disuccinimidyl esters such as 
3,3'-dithiobis(succinimidylpropionate), and bifunctional maleimides such as 
bis-N-maleimido-1,8-octane. Derivatizing agents such as 
methyl-3-[(p-azidophenyl)dithio]propioimidate yield photoactivatable intermediates 
that are capable of forming crosslinks in the presence of light. Alternatively, reactive 
water-insoluble matrices such as cyanogen bromide-activated carbohydrates and the 
reactive substrates described in U.S. Patent Nos. 3,969,287; 3,691,016; 4,195,128; 
4,247,642; 4,229,537; and 4,330,440 are employed for protein immobilization. 

Other modifications include hydroxylation of proline and lysine, 
phosphorylation of hydroxyl groups of seryl or threonyl residues, methylation of the 
alpha-amino groups of lysine, arginine, and histidine side chains (Creighton, T.E., 
Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco, 
pp. 79-86 (1983)), acetylation of the N-terminal amine, and, in some instances, 
amidation of the C-terminal carboxyl groups. 

Such derivatized moieties may improve the stability, solubility, absorption, 
biological half life, and the like. The moieties may alternatively eliminate or attenuate 
any undesirable side effect of the protein complex. Moieties capable of mediating 
such effects are disclosed, for example, in Alfonso and Gennaro (1995). 

The term "fragment" is used to indicate a polypeptide derived from the amino 
acid sequence of the protein or polypeptide having a length less than the full-length 
polypeptide from which it has been derived. Such a fragment may, for example, be 
produced by proteolytic cleavage of the full-length protein. Preferably, the fragment 
is obtained recombinantly by appropriately modifying the DNA sequence encoding 
the proteins to delete one or more amino acids at one or more sites of the C-terminus, 
N-terminus, and/or within the native sequence. 

Another functional derivative intended to be within the scope of the present 
invention is a "variant" polypeptide that either lacks one or more amino acids or 
contains additional or substituted amino acids relative to the native polypeptide. The 
variant may be derived from a naturally occurring polypeptide by appropriately 
modifying the protein DNA coding sequence to add, remove, and/or to modify codons 
for one or more amino acids at one or more sites of the C-terminus, N-terminus, 
and/or within the native sequence. 

A functional derivative of a protein or polypeptide with deleted, inserted 
and/or substituted amino acid residues may be prepared using standard techniques 
well-known to those of ordinary skill in the art. For example, the modified 
components of the functional derivatives may be produced using site-directed 
mutagenesis techniques (as exemplified by Adelman et al., 1983, DNA 2:183; 
Sambrook et al., 1989) wherein nucleotides in the DNA coding sequence are modified 
such that a modified coding sequence is produced, and thereafter expressing this 
recombinant DNA in a prokaryotic or eukaryotic host cell, using techniques such as 
those described above. Alternatively, components of functional derivatives of 
complexes with amino acid deletions, insertions and/or substitutions may be 
conveniently prepared by direct chemical synthesis, using methods well-known in the 
art. 

Insofar as other anti-microbial inhibitor compounds identified by the invention 
described herein may not be peptidal in nature, other chemical techniques exist to 
allow their suitable modification, as well, and according to the desirable principles 
discussed above. 

Administration and Pharmaceutical Compositions 

For the therapeutic and prophylactic treatment of infection, the preferred 
method of preparation or administration of anti-microbial compounds will generally 
vary depending on the precise identity and nature of the anti-microbial being 
delivered. Thus, those skilled in the art will understand that administration methods 
known in the art will also be appropriate for the compounds of this invention. 

The particularly desired anti-microbial can be administered to a patient either 
by itself, or in pharmaceutical compositions where it is mixed with suitable carriers or 
excipient(s). In treating an infection, a therapeutically effective amount of an agent or 
agents is administered. A therapeutically effective dose refers to that amount of the 
compound that results in amelioration of one or more symptoms of bacterial infection 
and/or a prolongation of patient survival or patient comfort. 

Toxicity, therapeutic and prophylactic efficacy of anti-microbials can be 
determined by standard pharmaceutical procedures in cell cultures and/or 
experimental organisms such as animals, e.g., for determining the LD50 (the dose 
lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 
50% of the population). The dose ratio between toxic and therapeutic effects is the 
therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that 
exhibit large therapeutic indices are preferred. The data obtained from these cell 
culture assays and animal studies can be used in formulating a range of dosage for use 
in humans. The dosage of such compounds lies preferably within a range of 
circulating concentrations that include the ED50 with little or no toxicity. The dosage 
may vary within this range depending upon the dosage form employed and the route 
of administration utilized. 
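
The therapeutic index defined above is a simple ratio, as the following Python sketch shows. The function name and the dose values are hypothetical and chosen only to illustrate the arithmetic; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical compound: LD50 = 500 mg/kg, ED50 = 25 mg/kg
    print(therapeutic_index(500.0, 25.0))
```

A larger ratio indicates a wider margin between the effective dose and the lethal dose, which is why compounds with large therapeutic indices are preferred.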

For any compound identified and used in the method of the invention, the 
therapeutically effective dose can be estimated initially from cell culture assays. Such 
information can be used to more accurately determine useful doses in organisms such 
as plants and animals, preferably mammals, and most preferably humans. Levels in 
plasma may be measured, for example, by HPLC or other means appropriate for 
detection of the particular compound. 

The exact formulation, route of administration and dosage can be chosen by 
the individual physician in view of the patient's condition (see, e.g., Fingl et al., in The 
Pharmacological Basis of Therapeutics, 1975, Ch. 1, p. 1). 

It should be noted that the attending physician would know how and when to
terminate, interrupt, or adjust administration due to toxicity, organ dysfunction, or
other systemic malady. Conversely, the attending physician would also know to
adjust treatment to higher levels if the clinical response were not adequate (precluding
toxicity). The magnitude of an administered dose in the management of the disorder
of interest will vary with the severity of the condition to be treated and the route of 
administration. The severity of the condition may, for example, be evaluated, in part, 
by standard prognostic evaluation methods. Further, the dose, and perhaps the dose
frequency, will also vary according to the age, body weight, and response of the
individual patient. A program comparable to that discussed above also may be used 
in veterinary or phyto medicine. 

Depending on the specific infection target being treated and the method 
selected, such agents may be formulated and administered systemically or locally, i.e., 
topically. Techniques for formulation and administration may be found in Alfonso 
and Gennaro (1995). Suitable routes may include, for example, oral, rectal,
transdermal, vaginal, transmucosal, intestinal, parenteral, intramuscular, 
subcutaneous, or intramedullary injections, as well as intrathecal, intravenous, or 
intraperitoneal injections. 

For injection, the agents of the invention may be formulated in aqueous 
solutions, preferably in physiologically compatible buffers such as Hanks' solution, 
Ringer's solution, or physiological saline buffer. For transmucosal administration, 
penetrants appropriate to the barrier to be permeated are used in the formulation. 
Such penetrants are generally known in the art. 

Use of pharmaceutically acceptable carriers to formulate identified
anti-microbials of the present invention into dosages suitable for systemic administration is
within the scope of the invention. With proper choice of carrier and suitable
manufacturing practice, the compositions of the present invention, in particular those
formulated as solutions, may be administered parenterally, such as by intravenous
injection. Appropriate compounds can be formulated readily using pharmaceutically
acceptable carriers well known in the art into dosages suitable for oral administration.
Such carriers enable the compounds of the invention to be formulated as tablets, pills, 
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by 
a patient to be treated. 

Agents intended to be administered intracellularly may be administered using
techniques well known to those of ordinary skill in the art. For example, such agents 
may be encapsulated into liposomes, then administered as described above. 
Liposomes are spherical lipid bilayers with aqueous interiors. All molecules present 
in an aqueous solution at the time of liposome formation are incorporated into the 
aqueous interior. The liposomal contents are both protected from the external 
microenvironment and, because liposomes fuse with cell membranes, are efficiently
delivered into the cell cytoplasm. Additionally, due to their hydrophobicity, small
organic molecules may be directly administered intracellularly. 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to 
achieve the intended purpose. Determination of the effective amounts is well within 
the capability of those skilled in the art. 

In addition to the active ingredients, these pharmaceutical compositions may 
contain suitable pharmaceutically acceptable carriers comprising excipients and
auxiliaries which facilitate processing of the active compounds into preparations
which can be used pharmaceutically. The preparations formulated for oral
administration may be in the form of tablets, dragees, capsules, or solutions, including 
those formulated for delayed release or only to be released when the pharmaceutical 
reaches the small or large intestine. 

The pharmaceutical compositions of the present invention may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating,
entrapping or lyophilizing processes. 

Pharmaceutical formulations for parenteral administration include aqueous 
solutions of the active anti-microbial compounds in water-soluble form. 
Alternatively, suspensions of the active compounds may be prepared as appropriate 
oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils 
such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, 
or liposomes. Aqueous injection suspensions may contain substances which increase 
the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents 
which increase the solubility of the compounds to allow for the preparation of highly 
concentrated solutions. 

Pharmaceutical preparations for oral use can be obtained by combining the 
active compounds with solid excipient, optionally grinding a resulting mixture, and 
processing the mixture of granules, after adding suitable auxiliaries, if desired, to 
obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as 
sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such 
as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum 
tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, 
disintegrating agents may be added, such as the cross-linked polyvinyl pyrrolidone, 
agar, or alginic acid or a salt thereof such as sodium alginate.

Dragee cores are provided with suitable coatings. For this purpose,
concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium
dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification
or to characterize different combinations of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit 
capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a 
plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active
ingredients in admixture with filler such as lactose, binders such as starches, and/or
lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft
capsules, the active compounds may be dissolved or suspended in suitable liquids,
such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added.

The above methodologies may be employed either actively or prophylactically
against an infection of interest.

Computer-related Aspects and Embodiments 

In addition to the provision of compounds as chemical entities, nucleotide
sequences, or fragments thereof at least 95%, preferably at least 97%, more preferably
at least 99%, and most preferably at least 99.9% identical to phage inhibitor sequences 
can also be provided in a variety of additional media to facilitate various uses. 

Thus, as used in this section, "provided" refers to an article of manufacture,
rather than an actual nucleic acid molecule, which contains a nucleotide sequence of
the present invention; e.g., a nucleotide sequence of an exemplary bacteriophage or a
sequence encoding a bacterial target or a fragment thereof, preferably a nucleotide
sequence at least 95%, more preferably at least 99% and most preferably at least
99.9% identical to such a bacteriophage or bacterial sequence, for example, to a
polynucleotide of an unsequenced phage listed in Table 1, preferably of bacteriophage
77 (S. aureus host) or bacteriophage 3A (S. aureus host) or bacteriophage 96 (S.
aureus host). Such an article provides a large portion of the particular bacteriophage
genome or bacterial gene and parts thereof (e.g., a bacteriophage open reading frame
(ORF)) in a form which allows a skilled artisan to examine and/or analyze the
sequence using means not directly applicable to examining the actual genome or gene
or subset thereof as it exists in nature or in purified form as a chemical entity.

In one application of this aspect, a nucleotide sequence of the present 
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium that can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media, such
as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such
as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these
categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable media can be used
to create an article of manufacture which includes one or more computer readable
media having recorded thereon a nucleotide sequence or sequences of the present
invention. Likewise, it will be clear to those of skill how additional computer
readable media that may be developed also can be used to create analogous
manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on 
computer readable medium. A skilled artisan can readily adopt any of the presently
known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present
invention. 

A variety of data storage structures are available to a skilled artisan for 
creating a computer readable medium having recorded thereon a nucleotide sequence 
of the present invention. The choice of the data storage structure will generally be
based on the means chosen to access the stored information. In addition, a variety of
data processor programs and formats can be used to store the nucleotide sequence
information of the present invention on computer readable medium. The sequence
information can, for example, be presented in a word processing text file, formatted in
commercially available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as
DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of 
data processor structuring formats (e.g., text file or database) in order to obtain 
computer readable medium having recorded thereon the nucleotide sequence 
information of the present invention. 
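
By way of non-limiting illustration, the following Python sketch records a nucleotide sequence both as a plain ASCII (FASTA-style) text record and as a row in a database. SQLite is used here only as a readily available stand-in for the database applications named above; the table name, function names and the short sequence are hypothetical.

```python
import sqlite3


def to_fasta(name, sequence, width=60):
    """Format a sequence as an ASCII FASTA record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def store_sequence(conn, name, sequence):
    """Record a sequence in a (hypothetical) 'sequences' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(name TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO sequences (name, residues) VALUES (?, ?)",
        (name, sequence.upper()),
    )
    conn.commit()


def load_sequence(conn, name):
    """Return the stored sequence, or None if the name is unknown."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


# Illustrative use with an in-memory database and a made-up fragment.
connection = sqlite3.connect(":memory:")
store_sequence(connection, "example_fragment", "atggctgcttaa")
print(to_fasta("example_fragment", load_sequence(connection, "example_fragment")))
```

The text form suits programs that read files directly, whereas the database form suits access by query, consistent with the choice of storage structure following from the means of access.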

Computer software is publicly available which allows a skilled artisan to
access sequence information provided in a computer readable medium. Thus, by
providing in computer readable form a nucleotide sequence of an unsequenced
bacteriophage, such as an exemplary bacteriophage listed in Table 1 or of a sequence
encoding a bacterial target or a fragment thereof, preferably a nucleotide sequence at
least 95%, more preferably at least 99% and most preferably at least 99.9% identical
to such a bacteriophage or bacterial sequence, for example, to a polynucleotide of
bacteriophage 77 (S. aureus host), bacteriophage 3A (S. aureus host), bacteriophage
96 (S. aureus host), bacteriophage 44AHJD (S. aureus host), bacteriophage Dp-1
(Streptococcus pneumoniae host), or bacteriophage 182 (Enterococcus host), the
present invention enables the skilled artisan to routinely access the provided sequence
information for a wide variety of purposes.

Those skilled in the art understand that a variety of different search or analysis
software can implement sequence search and analysis algorithms, e.g., the BLAST
(Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp.
Chem. 17:203-207 (1993)) search algorithms. For example, such search algorithms can
be implemented on a Sybase system and used to identify open reading frames (ORFs)
within the bacteriophage genome which contain homology to ORFs or proteins from
other viruses, e.g., other bacteriophage, and other organisms, e.g., the host bacterium.
Among the ORFs discussed herein are protein encoding fragments of the bacteriophage
genomes which encode bacteria-inhibiting proteins or fragments.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information described. Such systems are
designed to identify, among other things, useful fragments of the bacteriophage 
genomes. 

As used herein, "a computer-based system" refers to the hardware, software,
and data storage media used to analyze the nucleotide sequence information of the
present invention. The minimum hardware of the computer-based systems of the
present invention comprises a central processing unit (CPU), input device, output
device, and data storage medium or media. A skilled artisan will readily recognize
that any of the currently available general purpose computer-based systems are suitable
for use in the present invention, as well as a variety of different specialized or
dedicated computer-based systems. 

As stated above, the computer-based systems of the present invention 
comprise data storage media having stored therein a nucleotide sequence of the 
present invention and the necessary hardware and software for supporting and
implementing a search and/or analysis program.

As used herein, "data storage media" refers to memory which can store 
nucleotide sequence information of the present invention, or a memory access means 
which can access manufactures having recorded thereon the nucleotide sequence 
information of the present invention. 

As used herein, "search program" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage means.
Search means are used to identify fragments or regions of the present genomic
sequences which match a particular target sequence or target motif. A variety of
known algorithms are disclosed publicly, and a variety of commercially available
software for conducting search means are available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not
limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan
can readily recognize that any one of the available algorithms or implementing 
software packages for conducting homology searches and/or sequence analyses can be 
adapted for use in the present computer-based systems. 

As used herein in connection with sequence searches and analyses, a "target 
sequence" can be any DNA or amino acid sequence of six or more nucleotides or two 
or more amino acids. A skilled artisan can readily recognize that the longer a target 
sequence is, the less likely a target sequence will be present as a random occurrence in 
the database. Also, the target sequence length is preferably selected to include 
sequence corresponding to a biologically relevant portion of an encoded product, for 
example a region which is expected to be conserved across a range of source 
organisms. Preferably the sequence length of a target polypeptide sequence is from
5-100 amino acids, more preferably 7-50 or 7-100 amino acids, and still more preferably
10-80 or 10-100 amino acids. Preferably the sequence length of a target
polynucleotide sequence is from 15-300 nucleotide residues, more preferably from
21-240 or 21-300, and still more preferably 30-150 or 30-300 nucleotide residues.
However, it is well recognized that searches for commercially important fragments, 
such as sequence fragments involved in gene expression and protein processing, may 
be of shorter length. Likewise, it may be desirable to search and/or analyze longer 
sequences. 
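
By way of non-limiting illustration, the relationship between target sequence length and chance occurrence can be estimated as in the following Python sketch. It assumes, purely for simplicity, that every residue occurs independently and with equal frequency, which real genomes and proteins do not satisfy; the function name and database sizes are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches of one target in a random database.

    alphabet_size is 4 for nucleotides and 20 for amino acids. Residues are
    assumed to be independent and equally frequent (a simplifying assumption).
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_length


# Hypothetical database of one million nucleotides.
for length in (6, 15, 30):
    print(length, expected_random_hits(length, 1_000_000))
```

Under these assumptions each additional nucleotide reduces the expected number of chance matches roughly four-fold, so the longer preferred target lengths are far less likely to occur at random than the six-nucleotide minimum.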

As used herein, "a target structural motif," or "target motif," refers to any 
rationally selected sequence or combination of sequences in which the sequence(s) are 
chosen based on a three-dimensional configuration which is formed upon the folding 
of the target motif. There are a variety of target motifs known in the art. Protein 
target motifs include, but are not limited to, enzymatic active sites and signal 
sequences. Nucleic acid target motifs include, but are not limited to, promoter
sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 

A variety of structural formats for the input and output devices can be used to 
input and output the information in the computer-based systems of the present 
invention. A preferred format for an output device ranks fragments of the 
bacteriophage or bacterial sequences possessing varying degrees of homology to the
target sequence or target motif. Such presentation provides a skilled artisan with a
ranking of sequences which contain various amounts of the target sequence or target 
motif and identifies the degree of homology contained in the identified fragment. 
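
By way of non-limiting illustration, a ranked output of this kind can be sketched in Python as follows. The sketch scores ungapped windows of the stored sequence by fraction of identical residues, which is a deliberately simple stand-in for the homology search programs discussed herein and is not the BLAST algorithm; the function name and sequences are hypothetical.

```python
def rank_windows(target, stored, top=5):
    """Rank every window of the stored sequence by identity to the target.

    Returns (identity, start, window) tuples, best match first; ties are
    broken by position. No gaps are considered (a simplifying assumption).
    """
    target = target.upper()
    stored = stored.upper()
    n = len(target)
    hits = []
    for start in range(len(stored) - n + 1):
        window = stored[start:start + n]
        matches = sum(a == b for a, b in zip(target, window))
        hits.append((matches / n, start, window))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits[:top]


# Made-up sequences, for illustration only.
for identity, start, window in rank_windows("ACGT", "TTACGTTTACGA", top=2):
    print(f"{identity:.2f} {start} {window}")
```

Each entry reports both the matching fragment and its degree of identity, mirroring the ranked presentation described above.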

A variety of comparing methods and/or devices and/or formats can be used to
compare a target sequence or target motif with the sequence stored in data storage
media to identify sequence fragments of the bacteriophage or bacterium in question.
One skilled in the art can readily recognize that any one of the publicly available
homology search programs can be used as the search program for the computer-based
systems of the present invention. Of course, suitable proprietary systems that may be
known to those of skill, or later developed, also may be employed in this regard.

Figure 6 provides a block diagram of a computer system illustrative of
embodiments of this aspect of the present invention. The computer system 102 includes a
processor 106 connected to a bus 104. Also connected to the bus 104 are a main
memory 108 (preferably implemented as random access memory, RAM) and a variety
of secondary storage devices 110, such as a hard drive 112 and a removable medium
storage device 114. The removable medium storage device 114 may represent, for
example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A
removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic
tape, etc.) containing control logic and/or data recorded therein may be inserted into
the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable
storage medium 116, once it is inserted into the removable medium storage
device 114.

A nucleotide sequence of the present invention may be stored in a well-known
manner in the main memory 108, any of the secondary storage devices 110, and/or a
removable storage medium 116. During execution, software for accessing and
processing the sequence (such as search tools, comparing tools, etc.) resides in main
memory 108, in accordance with the requirements and operating parameters of the
operating system, the hardware system and the software program or programs.

The data storage medium in which the sequence is embodied and the central
processor need not be part of a single stand-alone computer, but may be separated so
long as data transfer can occur. For example, the processor or processors being
utilized for a search or analysis can be part of one general purpose computer, and the
data storage medium can be part of a second general purpose computer connected to a
network, or the data storage medium can be part of a network server. As another
example, the data storage medium can be part of a computer system or network
accessible over telephone lines or other remote connection method.

EXAMPLES



Example 1. Growth of Staph A bacteriophage 77 and purification of genomic
DNA. 

The Staphylococcus aureus propagating strain (PS 77; ATCC #27699) was
used as a host to propagate its respective phage 77 (ATCC #27699-B1). Two rounds
of plaque purification of phage 77 were performed on soft agar essentially as
described in Sambrook et al. (1989). Briefly, the PS 77 strain was grown overnight at
37°C in Nutrient broth [NB: 0.3% Bacto beef extract, 0.5% Bacto peptone (Difco
Laboratories) and 0.5% NaCl (w/v)]. The culture was then diluted 20x in NB and
incubated at 37°C until the OD540 = 0.2 (early log phase) with constant agitation. In
order to obtain single plaques, phage 77 was subjected to 10-fold serial dilutions using
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin (w/v)) and
10 µl of each dilution was used to infect 0.5 ml of the cell suspension in the presence
of 400 µg/ml CaCl2. After incubation of 15 min at room temperature (RT), 2 ml of
melted soft agar kept at 45°C (NB supplemented with 0.6% agar) was added to the
mixture and poured onto the surface of 100 mm nutrient agar plates (0.3% Bacto beef
extract, 0.5% Bacto peptone, 0.5% NaCl and 1.5% Bacto agar (w/v)). After overnight
incubation at 30°C, a single plaque was isolated, resuspended in 1 ml of phage buffer
by end over end rotation for 2 hrs at 20°C, and the phage suspension was diluted and
used for a second infection as described above. After overnight incubation at 30°C, a
single plaque was isolated and used as a stock.

The propagation procedure for bacteriophage 77 was modified from the agar
layer method of Swanstrom and Adams (1951). Briefly, the PS 77 strain was grown to
stationary phase overnight at 37°C in Nutrient broth. The culture was then diluted
twenty-fold in NB and incubated at 37°C until the OD540 = 0.2. The suspension (15 x 10^7
bacteria) was then mixed with 15 x 10^5 plaque forming units (pfu) to give a ratio of
100 bacteria/phage particle in the presence of 400 µg/ml of CaCl2. After incubation
for 15 min at 20°C, 7.5 ml of melted soft agar (NB plus 0.6% agar) were added to the
mixture and poured onto the surface of 150 mm nutrient agar plates and incubated 16
hrs at 30°C. To collect the phage plate lysate, 20 ml of NB were added to each plate
and the soft agar layer was collected by scraping off with a clean microscope slide
followed by shaking of the agar suspension for 5 min to break up the agar. The
mixture was then centrifuged for 10 min at 4,000 RPM (2,830xg) in a JA-10 rotor
(Beckman) and the supernatant fluid (lysate) was collected and subjected to a
treatment with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate
the phage particles, the phage suspension was adjusted to 10% (w/v) PEG 8000 and
0.5 M of NaCl followed by incubation at 4°C for 16 hrs. The phage was recovered by
centrifugation at 4,000 rpm (3,500xg) for 20 min at 4°C on a GS-6R table top
centrifuge (Beckman). The pellet was resuspended with 2 ml of phage buffer (1 mM
MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin). The phage suspension was
extracted with 1 volume of chloroform and further purified by centrifugation on a
cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS 55
rotor centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm
(67,000xg) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000xg) for 24 h at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 mg/ml Proteinase K and 0.5% SDS and
incubating for 1 h at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris pH 8.0, 1 mM EDTA).

Example 2. DNA sequencing of Bacteriophage 77 genome

Four micrograms of phage 77 DNA was diluted in 200 µl of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml eppendorf tube and sonication was performed
(550 Sonic Dismembrator™, Fisher Scientific). Samples were sonicated under an
amplitude of 3 µm with bursts of 5 s spaced by 15 s cooling in ice/water for 3 to 4
cycles. The sonicated DNA was then size fractionated by electrophoresis on 1%
agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution of 50 µl of 1 mM Tris
(pH 8.5).

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C followed by addition of 12.5
units of Klenow large fragment (New England Biolabs) for 15 min at room
temperature. The reaction was stopped by two phenol/chloroform extractions and the
DNA was precipitated with ethanol and the final DNA pellet was resuspended in 20
µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of pKSII+ vector (Stratagene) dephosphorylated by treatment with calf
intestinal alkaline phosphatase (New England Biolabs). A typical ligation reaction
contained 100 ng of vector DNA and 2 to 5 µl of
repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl containing 800
units of T4 DNA ligase (New England Biolabs) and was incubated overnight at 16°C.
Transformation and selection of bacterial clones containing recombinant plasmids was
performed in E. coli DH10β according to standard procedures (Sambrook et al.,
1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKSII+ vector. PCR amplification of foreign
insert was performed in a 15 µl reaction volume containing 10 mM Tris (pH 8.3), 50
mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, and
0.75 units Taq polymerase (BRL). The thermocycling parameters were as follows:
initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 57°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was 
determined using an ABI 377-36 automated sequencer with two types of chemistry: 
ABI prism Big Dye™ primer or ABI prism Big Dye™ terminator cycle sequencing 
ready reaction kit (Applied Biosystems). To ensure co-linearity of the sequence data 
and the genome, all regions of phage genome were sequenced at least once from both 
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing 
template employing ABI prism Big Dye™ terminator cycle sequencing ready reaction 
kit. 

Example 3. Bioinformatic management of primary nucleotide sequence from
Phage 77.

Phage 77 sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing ABI
prism BIG DYE™ terminator cycle sequencing ready reaction kit. The complete 
sequence of bacteriophage 77 is shown in Table 2. 

A software program was developed and used on the assembled sequence of
bacteriophage 77 to identify all putative ORFs larger than 33 codons. Other ORF
identification software can also be utilized, preferably programs which allow
alternative start codons. The software scans the primary nucleotide sequence starting
at nucleotide #1 for an appropriate start codon. Three possible selections can be made
for defining the nature of the start codon: I) selection of ATG, II) selection of ATG or
GTG, and III) selection of any of ATG, GTG, TTG, CTG, ATT, ATC, and ATA. This
latter initiation codon set corresponds to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code.

When an appropriate start codon is encountered, a counting mechanism is
employed to count the number of codons (groups of three nucleotides) between this
start codon and the next stop codon downstream of it. If a threshold value of 33 is
reached, or exceeded, then the sequence encompassed by these two codons (start and
stop codons) is defined as an ORF. This procedure is repeated, each time starting at
the next nucleotide following the previous stop codon found, in order to identify all
the other putative ORFs. The scan is performed on all three reading frames of both
DNA strands of the phage sequence.
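
By way of non-limiting illustration, the scanning procedure described above can be sketched in Python as follows. This is not the software program actually used; it is a minimal reconstruction under stated assumptions: the codon count includes the start codon and excludes the stop codon, each reading frame is walked codon by codon, a start codon with no downstream in-frame stop is ignored, and coordinates on the "-" strand refer to the reverse-complemented sequence. All names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, start_set="I", min_codons=33):
    """Return (strand, frame, start, end, n_codons) for each putative ORF.

    start and end are 0-based, end-exclusive, and include the stop codon.
    """
    seq = seq.upper()
    starts = START_SETS[start_set]
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            i = frame
            while i + 3 <= len(s):
                if s[i:i + 3] not in starts:
                    i += 3
                    continue
                # Walk downstream, in frame, to the next stop codon.
                j = i
                while j + 3 <= len(s) and s[j:j + 3] not in STOP_CODONS:
                    j += 3
                if j + 3 > len(s):
                    break  # no in-frame stop codon before the sequence ends
                n_codons = (j - i) // 3
                if n_codons >= min_codons:
                    orfs.append((strand, frame, i, j + 3, n_codons))
                i = j + 3  # resume just after the stop codon
    return orfs


# Made-up test sequence: one start codon, 40 alanine codons, one stop codon.
demo = "CC" + "ATG" + "GCT" * 40 + "TAA" + "GG"
print(find_orfs(demo, start_set="I"))
```

Selecting start_set "II" or "III" widens the set of accepted initiation codons in the manner of the three selections described above.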

Sequence homology (BLAST) searches for each ORF are then carried out
using an implementation of BLAST programs, although any of a variety of different
sequence comparison and matching programs can be utilized as known to those
skilled in the art. Downloaded public databases used for sequence analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) S. aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);

vi) Streptococcus pyogenes (ftp://ftp.genome.ou.edu/pub/strep/strep-lk.fa);

vii) Streptococcus pneumoniae
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);

viii) Mycobacterium tuberculosis CSU#9
(ftp://ftp.tigr.org/pub/data/m_tuberculosis/TB_091097.Z); and

ix) Pseudomonas aeruginosa (http://www.genome.washington.edu/pseudo/data.html).

The results of the homology searches performed on the ORFs are shown in
Table 5.

Example 4. Subcloning of Bacteriophage 77 ORFs into a Staph A inducible
expression system.

The shuttle vector pT0021, in which firefly luciferase (lucFF) expression
is controlled by the ars (arsenite) promoter/operator (Tauriainen et al., 1997), was
modified in the following fashion. Two oligonucleotides corresponding to a short
antigenic peptide derived from the hemagglutinin protein of influenza virus (HA
epitope tag) were synthesized (Field et al., 1988). The sense strand HA tag sequence
(with BamHI, SalI and HindIII cloning sites) is:

5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3'
(where upper case letters denote the nucleotide sequence of the HA tag); the antisense
strand HA tag sequence (with a HindIII cloning site) is:

5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3'
(where upper case letters denote the sequence of the HA tag). The two HA tag
oligonucleotides were annealed and ligated into the pT0021 vector, which had been
digested with BamHI and HindIII. This manipulation resulted in replacement of the
lucFF gene by the HA tag. This modified shuttle vector, containing the
arsenite-inducible promoter, the arsR gene, and the HA tag, was named pTHA. A diagram
outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A.
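
The design of the two HA tag oligonucleotides can be checked with a short script. The following is only an illustrative sketch, not part of the described procedure: it takes the two sequences as printed above, confirms that they pair exactly once the first four nucleotides of each are set aside as single-stranded overhangs, and translates the upper case portion. The function names and the reduced codon table (restricted to codons present in the tag) are assumptions made for this example.

```python
# Sequences as given in the text; lower case = cloning sites, upper case = HA tag.
SENSE = "gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA"
ANTISENSE = "agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Reduced codon table (assumption: only the codons that occur in the HA tag).
CODONS = {"TAC": "Y", "CCA": "P", "GAC": "D", "GTC": "V",
          "GCC": "A", "AGC": "S", "TGA": "*"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


def overhangs(sense: str, antisense: str, n: int = 4):
    """Return the 5' overhang of each oligo if the remaining bases pair exactly.

    Returns None when the two oligos are not perfectly complementary
    outside their first n nucleotides.
    """
    if reverse_complement(sense[n:]).upper() != antisense[n:].upper():
        return None
    return sense[:n], antisense[:n]


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the reduced codon table."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


# The HA tag proper is the upper case portion of the sense strand.
ha_tag = "".join(base for base in SENSE if base.isupper())

print(overhangs(SENSE, ANTISENSE))  # 5'-gatc (BamHI-compatible), 5'-agct (HindIII-compatible)
print(translate(ha_tag))            # HA epitope followed by a stop codon
```

The two overhangs correspond to the single-stranded ends left by BamHI and HindIII digestion, which is consistent with ligation into the doubly digested pT0021 vector.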

Each ORF encoded by Bacteriophage 77 that is larger than 33 amino acids and
has a Shine-Dalgarno sequence upstream of the initiation codon was selected for
functional analysis for bacterial inhibition. In total, 98 ORFs were selected and
screened as detailed below. A list of these is presented in Table 3. Each individual
ORF, from initiation codon to last codon (excluding the stop codon), was amplified
from phage genomic DNA using the polymerase chain reaction (PCR). For PCR
amplification of ORFs, each sense strand primer targets the initiation codon and is
preceded by a BamHI restriction site (5'-cgggatcc-3'), and each antisense oligonucleotide
targets the penultimate codon (the one before the stop codon) of the ORF and is
preceded by a SalI restriction site (5'-gcgtcgaccg-3'). The PCR product of each ORF was
gel purified and digested with BamHI and SalI. The digested PCR product was then
gel purified using the Qiagen kit as described, ligated into BamHI- and SalI-digested
pTHA vector, and used to transform E. coli bacterial strain DH10β (as described
above). As a result of this manipulation, the HA tag is set in frame with the ORF and is
positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant
pTHA/ORF clones were picked and their insert sizes were confirmed by PCR analysis
using primers flanking the cloning site. The names and sequences of the primers that
were used for the PCR amplification were: HAF:
5'-TATTATCCAAAACTTGAACA-3'; HAR: 5'-CGGTGGTATATCCAGTGATT-3'. The
sequence integrity of cloned ORFs was verified directly by DNA sequencing using
primers HAF and HAR. In cases where verification of ORF sequence could not be
achieved by one pass with the sequencing primers, additional internal primers were
selected and used for sequencing.
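
The primer layout described above (restriction-site tail followed by ORF-specific sequence) can be sketched as follows. This is a hedged illustration only: the two tails are taken from the text, but the function name, the 18-nucleotide annealing length, and the input checks are assumptions introduced for the example and are not stated in the procedure.

```python
BAMHI_TAIL = "cgggatcc"    # 5' tail of each sense primer (from the text)
SALI_TAIL = "gcgtcgaccg"   # 5' tail of each antisense primer (from the text)
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(orf: str, anneal_len: int = 18):
    """Build sense and antisense primers for an ORF given start to stop.

    anneal_len is a hypothetical choice; the text does not specify how many
    ORF-specific nucleotides each primer carries.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must be whole codons ending in a stop codon")
    coding = orf[:-3]  # amplify up to the last codon, excluding the stop codon
    if len(coding) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    sense = BAMHI_TAIL + coding[:anneal_len]
    antisense = SALI_TAIL + reverse_complement(coding[-anneal_len:])
    return sense, antisense


# Toy ORF used only to exercise the function.
example_orf = "ATG" + "GCT" * 10 + "AAA" + "TAA"
sense_primer, antisense_primer = design_primers(example_orf)
print(sense_primer)
print(antisense_primer)
```

Excluding the stop codon from the amplified region is what allows the downstream HA tag in pTHA to be read in frame with the ORF.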

Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) was used as a
recipient for the expression of recombinant plasmids. Electroporation was performed
essentially as previously described (Schenk and Laddaga, 1992). Selection of
recombinant clones was performed on Luria-Broth agar (LB-agar) plates containing
30 µg/ml of kanamycin.

For each ORF introduced in the pTHA plasmid, 3 independent transformants
were isolated and used to individually inoculate cultures in 5 ml of TSB containing
30 µg/ml kanamycin, followed by growth to saturation (16 hrs at 30°C). An aliquot of
this stationary phase culture was used to generate a frozen glycerol stock of the
transformant (stored at -80°C). The remaining culture was used for plasmid DNA
extraction. Bacterial cells were harvested by centrifugation at 3,000 x g at 22°C for 5
min. The pellet was resuspended in 200 µl of 25% sucrose containing 25 U/ml of
lysostaphin and incubated for 15 min at 37°C. Then, 400 µl of alkaline SDS solution
(3% SDS, 0.2 N NaOH) were added, mixed well and incubated for 7 min at room
temperature. After the alkaline SDS treatment, 300 µl of ice-cold 3 M sodium acetate
pH 4.8 were added, and the mix was immediately spun at 13,000 x g for 15 min at room
temperature. The supernatant was transferred to a new 1.5 ml conical centrifuge tube
and 650 µl of isopropanol (stored at room temperature) were added. The mix was then
centrifuged at 13,000 x g for 5 min. The supernatant fluid was discarded, and the pellet
was washed with 70% ethanol and resuspended in 320 µl of sterile distilled water.

The presence of individual phage 77 ORF DNA inserts in the plasmid was
verified by PCR amplification using 1.5 µl of transformant miniprep DNA in a PCR
with primers flanking the cloning site of the ORF in the pTHA vector (HAF and HAR). The
composition of the PCR reaction and the cycling parameters are identical to those
employed for library screening described above.

Example 5. Functional assay for bacterial inhibitory activity of bacteriophage 77
ORFs.

The anti-microbial activity of individual phage 77 ORFs was monitored by
two growth inhibitory assays, one on solid agar medium, the other in liquid medium.
In general, Staphylococcus bacteria transformed with expression plasmids containing
individual ORFs were grown in normal TSA medium and stored in 19% glycerol. At
pre-determined times, arsenite was added to the culture to induce transcription of the
phage 77 ORFs cloned immediately downstream from an arsenite-inducible promoter
in the pTHA expression plasmid.

The effect of ORF induction on bacterial growth characteristics was then
monitored and quantitated. The growth inhibition assay on solid medium was
performed by streaking pTHA/ORF-containing S. aureus transformants onto LB-Kn
and TSA-Kn plates containing increasing concentrations of sodium arsenite (0; 2.5; 5;
and 7.5 µM). Arsenite is used to induce the expression of cloned DNA in the pTHA
vector. In parallel, 3 µl of 1/10 and 1/100 dilutions of the frozen cultures of the
pTHA/ORF transformants were spotted as single drops onto LB-Kn and TSA-Kn
plates containing increasing concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM).
The plates were then incubated for 16 hrs at 37°C, and the effect of arsenite-induced ORF
expression on bacterial growth was monitored and quantitated by comparing the
extent of growth to that seen on control plates. As positive controls for growth inhibition, the
holin/lysin genes of the Staphylococcus aureus phage Twort (Loessner et al., 1998)
were subcloned into the pTHA ars inducible vector and used.

For the growth inhibition assay in liquid medium, stationary phase cultures
were prepared by inoculating 2.5 ml of TSB-Kn with frozen S. aureus RN4220
transformants containing phage 77 ORFs cloned in the pTHA vector, followed by
incubation for 16 hrs at 37°C. These cultures were then diluted 1/100 in the same
medium, and the bacteria were allowed to grow for 2 hrs at 37°C to reach early log
phase. 150 µl of such culture were then mixed with 2.35 ml of TSB-Kn medium with or
without arsenite (the final concentration of arsenite in the medium was 0 or 5 µM).
After 3.5 hrs of incubation at 37°C with shaking at 250 rpm, 100 µl of
bacterial culture was removed from each tube for OD565 measurement. Serial ten-fold
dilutions of the culture in buffered saline solution (0.85% NaCl) were then spotted
onto TSB-Kn plates. The plates were incubated at 37°C for 16 hrs and the number of
surviving colonies was counted the following day. The growth inhibitory property of
individual ORFs was then quantitated by comparing CFU numbers under normal or
arsenite-induction conditions. A schematic flow of the inhibition analysis is shown in
Fig. 3 (also applicable to inhibition analysis for the other phage and bacteria pointed
out herein). Inhibition results are shown in Figures 4A-C.
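
The CFU comparison used to quantitate growth inhibition can be illustrated with a small calculation. This is a sketch under stated assumptions rather than the procedure actually used: the text does not give the volume spotted per dilution, so the 5 µl spot volume and the colony counts below are hypothetical values chosen only to show the arithmetic.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Back-calculate CFU/ml of the undiluted culture from one countable spot."""
    return colonies * dilution_factor / plated_volume_ml


def surviving_fraction(cfu_induced: float, cfu_uninduced: float) -> float:
    """Ratio of CFU with arsenite induction to CFU without induction."""
    if cfu_uninduced <= 0:
        raise ValueError("the uninduced control must yield colonies")
    return cfu_induced / cfu_uninduced


# Hypothetical counts: 5 µl spots (assumed volume), ten-fold dilution series.
uninduced = cfu_per_ml(colonies=25, dilution_factor=10**4, plated_volume_ml=0.005)
induced = cfu_per_ml(colonies=25, dilution_factor=10**2, plated_volume_ml=0.005)

print(f"uninduced: {uninduced:.2e} CFU/ml")
print(f"induced:   {induced:.2e} CFU/ml")
print(f"surviving fraction: {surviving_fraction(induced, uninduced):.3f}")
```

A surviving fraction well below 1 under induction indicates that expression of the ORF reduces the number of viable cells.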

Example 6: Identification of Cecropin Signature Motif in Staphylococcus aureus
Bacteriophage 3A ORF

The genome sequence of S. aureus bacteriophage 3A was determined and the sequence
was analyzed essentially as described for bacteriophage 77 in the examples above.
Upon BLAST analysis of the identified open reading frames of phage 3A, the presence of
an amino acid sequence corresponding to a cecropin signature motif was observed.
This motif (WDGHKTLEK) is located at position aa 481-489. Cecropins were
originally identified in proteins from the cecropia moth and are recognized as potent
antibacterial proteins that constitute an important part of the cell-free immunity of
insects. Cecropins are small proteins (31-39 amino acid residues) that are active
against both Gram-positive and Gram-negative bacteria by disrupting the bacterial
membranes. Although the mechanisms by which the cecropins cause cell death are
not fully understood, they are generally thought to involve channel formation and
membrane destabilization.
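
Locating such a motif within a predicted protein, and reporting its position in the 1-based amino acid numbering used above, can be sketched as follows. This is an illustration only: it performs an exact string match for the motif reported in the text, and the test protein is a made-up placeholder, not the phage 3A sequence.

```python
def find_motif(protein: str, motif: str):
    """Return 1-based, inclusive (start, end) positions of every exact match."""
    hits = []
    start = protein.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = protein.find(motif, start + 1)
    return hits


MOTIF = "WDGHKTLEK"  # motif reported in the text

# Placeholder protein (hypothetical): the motif is embedded after 480 residues.
toy_protein = "A" * 480 + MOTIF + "A" * 20

print(find_motif(toy_protein, MOTIF))
```

Signature searches in practice usually use degenerate patterns rather than exact strings; the exact match here is a deliberate simplification.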

The identification of a motif corresponding to a known inhibitor suggests that
the product of ORF002 is also an inhibitory compound. Such inhibitory activity can
be confirmed as described herein or by other methods known in the art. Confirmation
of the inhibitory activity would indicate that the ORF product could serve as the basis
for construction of mimetic compounds and other inhibitors directed to the target of
the ORF002 product.

Boman & Hultmark, 1987, Ann. Rev. Microbiol. 41:103-126.
Boman, 1991, Cell 65:205-207.
Boman et al., 1991, Eur. J. Biochem. 201:23-31.
Wang et al., J. Biol. Chem. 273:27438-27448.

Example 7. Growth of Staphylococcus aureus bacteriophage 44AHJD:

Staphylococcus aureus propagating strain (PS 44A) (Felix d'Herelle Reference
Centre #HER 1101) was used as a host to propagate its respective phage 44AHJD
(Felix d'Herelle Reference Centre #HER 101). Two rounds of plaque purification of
phage 44AHJD were performed on soft agar essentially as described in Sambrook et
al. (1989). Briefly, the Staphylococcus aureus PS strain was grown overnight at 37°C
in Nutrient Broth [NB: 3 g Bacto Beef Extract, 5 g Bactopeptone per liter (Difco
Laboratories #0003-17-8), supplemented with 0.5% NaCl]. The culture was then
diluted 20 fold in NB and incubated at 37°C until the OD540 reached 0.2. In order to obtain
single plaques, phage 44AHJD was subjected to 10-fold serial dilutions using the
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% Gelatin) and 10 µl
were used to infect 0.5 ml of the cell suspension in the presence of 400 µg/ml of
CaCl2. After incubation for 15 min at room temperature, 2 ml of melted soft agar (NB
supplemented with 0.6% of agar) were added to the mixture and poured onto the
surface of 100 mm nutrient agar plates (3 g Bacto Beef extract, 5 g Bactopeptone,
0.5% NaCl and 15 g of Bacto agar per liter (Difco Laboratories #0001-17-0)). After
overnight incubation at 37°C, a single plaque was isolated and resuspended in 1 ml of
phage buffer by end over end rotation for 2 h at room temperature, and the phage
suspension was diluted and used for a second infection as described above. After
overnight incubation at 37°C, a single plaque was isolated and used as a stock.

Large-scale purification of bacteriophage and preparation of phage DNA were
as follows.

Propagation was carried out using the agar layer method
described by Swanstrom and Adams (1951). Briefly, the PS 44A strain was grown to
stationary phase overnight at 37°C in Nutrient Broth. The culture was then diluted 20x
in NB and incubated at 37°C until the A540 = 0.2. The suspension (15 x 10^7 bacteria)
was then mixed with 15 x 10^5 phage particles to give a ratio of 100 bacteria per phage
particle in the presence of 400 µg/ml of CaCl2. After incubation for 15 min at room
temperature, 7.5 ml of melted soft agar were added to the mixture and poured onto the
surface of 150 mm nutrient agar plates and incubated overnight at 37°C. To collect the
lysate, 20 ml of NB were added to each plate and the soft agar layer was collected by
scraping off with a clean microscope slide and shaken vigorously for 5 min to break
up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g)
using a JA-10 rotor (Beckman) and the supernatant (lysate) was collected and subjected
to a treatment with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To
precipitate the phage particles, 10% (w/v) of PEG 8000 and 0.5 M of NaCl were
added to the lysate and the mixture was incubated on ice for 16 h. The phage was
recovered by centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C on a GS-6R
table top centrifuge (Beckman).
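
The quantities in this step follow from simple arithmetic, sketched below for convenience. The sketch is illustrative only: the function names are invented for this example, the NaCl formula weight (58.44 g/mol) is a standard value not given in the text, and the calculation ignores the small volume change caused by dissolving the solids.

```python
NACL_G_PER_MOL = 58.44  # standard formula weight of NaCl (not stated in the text)


def phage_for_ratio(bacteria: float, bacteria_per_phage: float = 100.0) -> float:
    """Number of phage particles giving the stated bacteria-to-phage ratio."""
    return bacteria / bacteria_per_phage


def precipitation_reagents(lysate_ml: float,
                           peg_percent_wv: float = 10.0,
                           nacl_molar: float = 0.5):
    """Grams of PEG 8000 and NaCl to add to a lysate of the given volume."""
    peg_g = lysate_ml * peg_percent_wv / 100.0
    nacl_g = lysate_ml / 1000.0 * nacl_molar * NACL_G_PER_MOL
    return peg_g, nacl_g


print(phage_for_ratio(15e7))           # phage particles for 15 x 10^7 bacteria
print(precipitation_reagents(100.0))   # (g PEG 8000, g NaCl) for a 100 ml lysate
```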

The pellet was resuspended with 2 ml of phage buffer (1 mM MgSO4, 5 mM
MgCl2, 80 mM NaCl and 0.1% Gelatin). The phage suspension was extracted with 1
volume of chloroform and further purified by centrifugation on a preformed cesium
chloride step gradient as described in Sambrook et al. (1989), using a TLS 55 rotor
and centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm
(67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 h at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 µg/ml Proteinase K and 0.5% SDS and
incubating for 1 h at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 8.0], 1 mM
EDTA).

Example 8. DNA sequencing of the Bacteriophage 44AHJD genome.

Four µg of phage DNA was diluted in 200 µl of TE pH 8.0 in a 1.5 ml
eppendorf tube and sonication was performed (550 Sonic Dismembrator, Fisher
Scientific). Samples were sonicated under an amplitude of 3 µm with bursts of 5 s
spaced by 15 s cooling in ice/water for 3 to 4 cycles. The sonicated DNA was then
size fractionated by electrophoresis on 1% agarose gels.
Fractions ranging from 1 to 2 kbp were excised from the agarose gel and purified
using a commercial DNA extraction system according to the instructions of the
manufacturer (Qiagen) and eluted in 50 µl of 1 mM Tris-HCl [pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I as
follows. Reactions were performed in a final volume of 100 µl containing DNA, 10
mM Tris-HCl pH 8.0, 50 mM NaCl, 10 mM MgCl2, 1 mM DTT, 5 µg BSA, 100 µM
of each dNTP and 15 units of T4 DNA polymerase (New England Biolabs) for 20 min
at 12°C followed by addition of 12.5 units of Klenow fragment (New England
Biolabs) for 15 min at room temperature. The reaction was stopped by two
phenol/chloroform extractions and the DNA was ethanol precipitated and resuspended
in 20 µl of H2O.

Cloning of the sonicated phage DNA into pKSII vector and transformation:

Blunt-ended DNA fragments were cloned by ligation directly into the HincII
site of the pKSII vector (Stratagene) dephosphorylated with calf intestinal alkaline
phosphatase (New England Biolabs). A typical reaction contained 100 ng of vector and 2
to 5 µl of repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl
containing 800 units of T4 DNA ligase (New England Biolabs), and was incubated overnight at 16°C.
Transformation and selection of positive clones was performed in the host strain
DH10β of E. coli using ampicillin as a selective antibiotic as described in Sambrook
et al. (1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the HincII cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris-HCl
(pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each
dNTP, and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C, followed
by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 to 2 kbp
were selected and plasmid DNA was prepared from the selected clones using the
QIAprep™ spin miniprep kit (Qiagen).
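
The thermocycling parameters above can be written down as a small data structure, which makes the total programmed hold time easy to compute. This is only an illustrative sketch; the class and function names are assumptions for the example, and ramp times between temperatures are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Values taken from the text.
INITIAL = Step("initial denaturation", 94, 120)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 58, 30),
    Step("extension", 72, 120),
)
FINAL = Step("final extension", 72, 600)
N_CYCLES = 20


def hold_time_seconds(initial: Step, cycle, n_cycles: int, final: Step) -> int:
    """Sum of programmed hold times, excluding instrument ramp times."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


total = hold_time_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total} s ({total / 60:.0f} min)")
```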

The nucleotide sequence of the extremities of each recombinant clone was determined
using an ABI 377-36 automated sequencer with two types of chemistry: ABI prism
BigDye™ primer cycle sequencing (-21M13 primer: #403055) (M13REV primer:
#403056) or ABI prism BigDye™ terminator cycle sequencing ready reaction kit
(Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and the
genome, all regions of the phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template employing ABI prism BigDye™ terminator cycle sequencing ready reaction
kit.
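
One way to flag regions that do not yet meet the coverage criterion is sketched below. This is a hedged illustration, not the method actually used: it assumes one particular reading of the criterion (every position covered by at least one read in each direction, with reads from at least two distinct clones), and the read tuple layout and function name are invented for the example.

```python
def unmet_regions(genome_len: int, reads):
    """Return 0-based half-open (start, end) regions failing the criterion.

    reads: iterable of (clone_id, start, end, strand), strand "+" or "-",
    with 0-based half-open coordinates (an assumed, hypothetical layout).
    """
    fwd = [False] * genome_len
    rev = [False] * genome_len
    clones = [set() for _ in range(genome_len)]
    for clone_id, start, end, strand in reads:
        for pos in range(max(start, 0), min(end, genome_len)):
            if strand == "+":
                fwd[pos] = True
            else:
                rev[pos] = True
            clones[pos].add(clone_id)

    regions, region_start = [], None
    for pos in range(genome_len):
        ok = fwd[pos] and rev[pos] and len(clones[pos]) >= 2
        if not ok and region_start is None:
            region_start = pos
        elif ok and region_start is not None:
            regions.append((region_start, pos))
            region_start = None
    if region_start is not None:
        regions.append((region_start, genome_len))
    return regions


# Toy example on a 10-nucleotide "genome".
toy_reads = [("a", 0, 6, "+"), ("b", 0, 10, "-"), ("c", 4, 8, "+")]
print(unmet_regions(10, toy_reads))
```

Regions reported by such a check correspond to the areas where a dedicated sequencing primer and direct sequencing of phage DNA would be needed.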

Example 9. Bioinformatic management of primary nucleotide sequence.

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing ABI
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Staphylococcus aureus bacteriophage 44AHJD
is shown in Table 16.

A software program was used on the assembled sequence of bacteriophage
44AHJD to identify all putative ORFs larger than 33 codons. The software scans the
primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon.
Three possible selections can be made for defining the nature of the start codon: I)
selection of ATG, II) selection of ATG or GTG, and III) selection of any of ATG,
GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds
to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code. When an
appropriate start codon is encountered, a counting mechanism is employed to count
the number of codons (groups of three nucleotides) between this start codon and the
next stop codon downstream of it. If a threshold value of 33 is reached, or exceeded,
then the sequence encompassed by these two codons is defined as an ORF. This
procedure is repeated, each time starting at the next nucleotide following the previous
stop codon found, in order to identify all the other putative ORFs. The scan is
performed on all three reading frames of both DNA strands of the phage sequence.
The predicted ORFs for bacteriophage 44AHJD are listed in Tables 17 & 18.
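
The scanning procedure described above can be sketched in a few lines. The following is an illustrative reimplementation under stated assumptions, not the software actually used: the function names are invented, the codon count is taken to include the start codon and exclude the stop codon (the text leaves this ambiguous), and coordinates are reported as 0-based half-open intervals on the input strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def scan_frame(seq: str, frame: int, starts, min_codons: int = 33):
    """Scan one reading frame; return (start, end) of ORFs, end after the stop."""
    orfs = []
    n = len(seq)
    i = frame
    while i + 3 <= n:
        if seq[i:i + 3] in starts:
            j = i + 3
            while j + 3 <= n and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > n:
                break  # no in-frame stop codon downstream of this start
            codons = (j - i) // 3  # start codon included, stop codon excluded
            if codons >= min_codons:
                orfs.append((i, j + 3))
            i = j + 3  # resume right after the stop codon just found
        else:
            i += 3
    return orfs


def find_orfs(seq: str, start_set: str = "III", min_codons: int = 33):
    """Scan all three frames of both strands; coordinates refer to `seq`."""
    seq = seq.upper()
    n = len(seq)
    starts = START_SETS[start_set]
    found = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for begin, end in scan_frame(template, frame, starts, min_codons):
                if strand == "-":
                    begin, end = n - end, n - begin
                found.append((strand, begin, end))
    return found


# Toy sequence: one 41-codon ORF in the third frame of the forward strand.
toy = "CC" + "ATG" + "GCT" * 40 + "TAA" + "GG"
print(find_orfs(toy, start_set="I"))
```

Choosing start set III returns more, and often overlapping, candidates than set I, which is why the start codon selection is exposed as a parameter.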
Sequence homology searches for each ORF were carried out using an
implementation of BLAST programs. Downloaded public databases used for sequence
analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);
ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);
iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);
iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);
v) Staphylococcus aureus NCTC 8325
(ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);

vi) [entry illegible in source; URL ends in "97.Z"];
vii) PRODOM (ftp://ftp.toulouse.inra…ast.gz);

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);
ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/).

The results of the homology searches performed on the ORFs of bacteriophage
44AHJD are shown in Tables 19 & 20.



Example 10. Subcloning of Bacteriophage 44AHJD ORFs.

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage 44AHJD ORF sequence is
inducible. For example, the shuttle vector pT0021, in which firefly luciferase
(lucFF) expression is controlled by the ars (arsenite) promoter/operator (Tauriainen et
al., 1997), can be modified in the following fashion. Two oligonucleotides
corresponding to a short antigenic peptide derived from the hemagglutinin protein of
influenza virus (HA epitope tag) were synthesized (Field et al., 1988). The sense
strand HA tag sequence (with BamHI, SalI and HindIII cloning sites) is:

5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3'
(where upper case letters denote the nucleotide sequence of the HA tag); the antisense
strand HA tag sequence (with a HindIII cloning site) is:

5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3'
(where upper case letters denote the sequence of the HA tag). The two HA tag
oligonucleotides were annealed and ligated into the pT0021 vector, which had been
digested with BamHI and HindIII. This manipulation resulted in replacement of the
lucFF gene by the HA tag. This modified shuttle vector, containing the
arsenite-inducible promoter, the arsR gene, and the HA tag, was named pTHA. A diagram
outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A (another
useful vector construct is shown in Fig. 1B).

Each ORF encoded by Bacteriophage 44AHJD that is larger than 33 amino acids
and has a Shine-Dalgarno sequence upstream of the initiation codon can be
selected for functional analysis for bacterial inhibition. Each individual ORF, from
initiation codon to last codon (excluding the stop codon), can be amplified from phage
genomic DNA using the polymerase chain reaction (PCR). For PCR amplification of
ORFs, each sense strand primer targets the initiation codon and is preceded by a
BamHI restriction site (5'-cgggatcc-3'), and each antisense oligonucleotide targets the
penultimate codon (the one before the stop codon) of the ORF and is preceded by a
SalI restriction site (5'-gcgtcgaccg-3'). The PCR product of each ORF can be gel
purified and digested with BamHI and SalI. The digested PCR product can then be
gel purified using the Qiagen kit as described, ligated into BamHI- and SalI-digested
pTHA vector, and used to transform E. coli bacterial strain DH10β (as described
above). As a result of this manipulation, the HA tag is set in frame with the ORF and is
positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant
pTHA/ORF clones will be picked and their insert sizes confirmed by PCR
analysis using primers flanking the cloning site. The following primers can be used
for PCR amplification: HAF: 5'-TATTATCCAAAACTTGAACA-3'; HAR:
5'-CGGTGGTATATCCAGTGATT-3'. The sequence integrity of cloned ORFs can be
verified directly by DNA sequencing using primers HAF and HAR. In cases where
verification of ORF sequence cannot be achieved by one pass with the sequencing
primers, additional internal primers will be selected and used for sequencing.

Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) will be used as
a recipient for the expression of recombinant plasmids. Electroporation will be
performed essentially as previously described (Schenk and Laddaga, 1992). Selection
of recombinant clones will be performed on Luria-Broth agar (LB-agar) plates
containing 30 µg/ml of kanamycin.

Alternatively, a constitutive promoter can be used to drive expression of the
introduced ORF, and cell growth compared to that of control bacterial cells containing the
parental vector lacking any introduced phage ORF. Recombinant plasmids will be
introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using
electroporation as previously described (Schenk and Laddaga, 1992).

Cloning of ORFs with a Shine-Dalgarno sequence

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), can be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site, and each antisense strand primer starts at the last codon (excluding the stop
codon) and is preceded by a different restriction site. The PCR product of each ORF
will be gel purified and digested with the restriction enzymes whose sites are contained in
the PCR oligonucleotides. The digested PCR product is then gel purified using the
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial
strain DH10β. Recombinant clones are then picked and their insert sizes confirmed by
PCR analysis using primers flanking the cloning site as well as by restriction digestion.
The sequence fidelity of cloned ORFs can be verified by DNA sequencing using the
same primers as used for PCR. In cases where verification of ORFs cannot be
achieved by one pass of sequencing using primers flanking the cloning site, internal
primers can be selected and used for sequencing. Recombinant plasmids can be
introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using
electroporation as previously described (Schenk and Laddaga, 1992).
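
The text does not state how the presence of a Shine-Dalgarno sequence is determined. Purely as an illustration of one possible screen, the sketch below looks for a purine-rich core motif a short distance upstream of the initiation codon. Everything specific in it is an assumption made for the example: the list of core motifs, the 4 to 14 nucleotide spacer window, and the function name.

```python
# Hypothetical core motifs: sub-strings of the consensus AGGAGG.
SD_CORES = ("AGGAGG", "GGAGG", "AGGAG", "GGAG", "AGGA", "GAGG")


def has_shine_dalgarno(genome: str, start_index: int,
                       min_spacer: int = 4, max_spacer: int = 14,
                       cores=SD_CORES) -> bool:
    """True if a core motif ends min_spacer..max_spacer nt before the start codon.

    start_index is the 0-based position of the first nucleotide of the
    initiation codon on the strand being examined.
    """
    genome = genome.upper()
    longest = max(len(core) for core in cores)
    window_start = max(0, start_index - max_spacer - longest)
    upstream = genome[window_start:start_index]
    for core in cores:
        pos = upstream.find(core)
        while pos != -1:
            spacer = len(upstream) - (pos + len(core))
            if min_spacer <= spacer <= max_spacer:
                return True
            pos = upstream.find(core, pos + 1)
    return False


# Toy examples: a motif 7 nt upstream of ATG, and a region with no motif.
with_sd = "TTTT" + "AGGAGG" + "TTTTTTT" + "ATGGCT"
without_sd = "T" * 20 + "ATGGCT"
print(has_shine_dalgarno(with_sd, 17), has_shine_dalgarno(without_sd, 20))
```

ORFs on the reverse strand would be checked the same way after taking the reverse complement of the genome.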
Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates

The functional identification of killer ORFs can be performed by spreading an aliquot
of S. aureus transformed cells containing phage 44AHJD ORFs onto agar plates
containing different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM). The
plates are incubated overnight at 37°C, after which growth of the ORF
transformants on plates that contain arsenite is compared to growth on plates without arsenite.

2. Quantification of growth inhibition in liquid medium

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to mid log phase (OD540 = 0.2) with fresh medium containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated
for an additional 2 hrs at 37°C. The effect of expression of the phage 44AHJD ORFs
on bacterial cell growth is then monitored by measuring the OD540 and comparing the
rate of growth to that of the culture not containing inducer. As positive controls for growth
inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W. and
Blasi, U. 1993. Virology 193:1033-1036), and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G.,
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters 162:265-274) can be
subcloned into the ars inducible vector. An aliquot of the induced and uninduced
culture can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
full bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.
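
The colony-count criteria stated above can be expressed as a simple decision rule. This is a minimal sketch for illustration: the function name and the third "no inhibition" category are assumptions, and the rule treats any reduction in colony number as meaningful, whereas in practice a threshold and replicate counts would be needed.

```python
def classify_orf(colonies_induced: int, colonies_uninduced: int) -> str:
    """Classify an ORF from colony counts after growth with or without inducer."""
    if colonies_uninduced <= 0:
        raise ValueError("the uninduced control must yield colonies")
    if colonies_induced == 0:
        return "bactericidal"
    if colonies_induced < colonies_uninduced:
        return "bacteriostatic"
    return "no inhibition"


# Hypothetical counts used only to exercise the rule.
for induced, uninduced in [(0, 250), (40, 250), (260, 250)]:
    print(induced, uninduced, classify_orf(induced, uninduced))
```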

Example 11. Growth of Enterococcus bacteriophage 182 and purification of
genomic DNA.

The Enterococcus propagating strain (PS) (Enterococcus sp. Group D, Felix
d'Herelle Reference Centre #HER 1080) was used as host to propagate its respective
phage 182 (Felix d'Herelle Reference Centre #HER 80). Two rounds of plaque
purification of phage 182 were performed on soft agar essentially as described in
Sambrook et al. (1989). Briefly, the Enterococcus sp. PS strain was grown overnight
at 37°C in Tryptic Soy Broth [TSB: 17 g Bacto tryptone, 3 g Bacto soytone, 2.5 g
Bacto dextrose, 5 g Sodium chloride, and 2.5 g Dipotassium phosphate per liter
(Difco Laboratories #0370-17-3)]. The culture was then diluted 20 fold in TSB and
incubated at 37°C until the OD540 = 0.2 (early log phase) with constant agitation. In
order to obtain single plaques, phage 182 was subjected to 10 fold serial dilutions
using the phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% Gelatin
(w/v)) and 10 µl of each dilution was used to infect 0.5 ml of the bacterial cell
suspension. After incubation for 15 min at 37°C, 2 ml of melted soft agar (TSB
supplemented with 0.6% agar) was added to the mixture and poured onto the surface
of 100 mm Tryptic Soy Agar plates [TSA: 15 g Tryptone peptone, 5 g Soytone
peptone, 5 g Sodium chloride and 15 g of Agar per liter (Difco Laboratories
#0369-17)]. After overnight incubation at 37°C, a single plaque was isolated and resuspended in
1 ml of phage buffer by end over end rotation for 2 hrs at room temperature, and the
phage suspension was diluted and used for a second infection as described above.
After overnight incubation at 37°C, a single plaque was isolated and used as a stock
for all subsequent manipulations.

The propagation procedure for bacteriophage 182 was modified from the agar 
layer method of Swanstorm and Adams (1951). Briefly, the Enterococcus sp. PS 
strain was grown to stationary phase overnight at 37°C in TSB. The culture was then 
diluted 20 fold in TSB and incubated at 37°C until the A 540 = 0.2. The suspension 
(15xl0 7 Bacteria) was then mixed with 15xl0 5 plaque forming units (pfu).to give a 
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ratio of 100-bacteria/pfu. After incubation of 15 min at 37°C, 7.5 ml of melted soft 
agar (TSB plus 0.6% agar) were added to the mixture and poured onto the surface of 
150 mm TSA plates and incubated 16 hrs at 37°C. To collect the plate lysate, 20 ml 
of TSB were added to each plate and the soft agar layer was collected by scrapping off 
with a clean microscope slide followed by vigorous shaking of the agar suspension for 
5 min to break up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm 
(2,830 xg) using a JA-10 rotor (Beckman) and the supernatant fluid (lysate) is 
collected and subjected to a treatment with 10 ng /ml of DNase I and RNase A for 30 
min at 37°C. To precipitate the phage particles, the phage suspension was adjusted to 
10% (w/v) of PEG 8000 and 0.5 M of NaCl followed by incubation at 4°C for 16 hrs. 
The phage was recovered by centrifugation at 4,000 rpm (3,500 xg) for 20 min at 4°C 
on a GS-6R table top centrifuge (Beckman). The pellet was resuspended with 2 ml of 
phage buffer (I mM MgS0 4 , 5 mM MgCl 2 , 80 mM NaCl and 0.1% Gelatin). The 
phage suspension was extracted with 1 volume of chloroform and further purified by 
centrifugation on a cesium chloride step gradient as described in Sambrook et al. 
(1989), using a TLS 55 rotor and centrifuged in an Optima TLX ultracentrifuge 
(Beckman) for 2 hrs at 28,000 rpm (67,000 xg) at 4°C. Banded phage was collected 
and ultracentrifuged again on an isopycnic cesium chloride gradient (1.45 g/ml) at 
40,000 rpm (64,000 xg) for 24 hrs at 4°C using a TLV rotor (Beckman). The phages 
were harvested and dialyzed for 4 hrs at room temperature against 4 L of dialysis 
buffer consisting of 10 mM NaCl, 50 mM Tris-HCl [pH 8] and 10 mM MgCl2. Phage 
DNA was prepared from the phage suspension by adding 20 mM EDTA, 50 µg/ml 
Proteinase K and 0.5% SDS and incubating for 1 hr at 65°C, followed by successive 
extractions with 1 volume of phenol, 1 volume of phenol-chloroform and 1 volume of 
chloroform. The DNA was then dialyzed overnight at 4°C against 4 L of TE (10 mM 
Tris-HCl [pH 8.0], 1 mM EDTA). 
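
The rotor speeds and relative centrifugal forces quoted in this procedure are related by the standard formula RCF = 1.118 x 10^-5 x r x N^2, with r the rotor radius in cm and N the speed in rpm. A minimal sketch of the conversion follows. The radius of 15.8 cm is an assumption made here for illustration only; the text does not state rotor radii, and the value for a given rotor should be taken from the manufacturer's documentation.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a rotor speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return (rcf / (1.118e-5 * radius_cm)) ** 0.5


# Assumed rotor radius in cm (hypothetical; not given in the text).
ASSUMED_RADIUS_CM = 15.8

lysate_rcf = rcf_from_rpm(4000, ASSUMED_RADIUS_CM)
print(f"4,000 rpm at {ASSUMED_RADIUS_CM} cm is roughly {lysate_rcf:.0f} x g")
```

With the assumed radius, the result is close to the 2,830 xg quoted for the lysate clarification step.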

Example 12. DNA sequencing of the Bacteriophage 182 genome. 

Four micrograms of phage DNA was diluted in 200 µl of TE (10 mM Tris, 
[pH 8.0], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonication was performed 
(550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an 
amplitude of 3 µm with bursts of 5 s spaced by 15 s of cooling in ice/water for 3 to 4 
cycles. The sonicated DNA was then size fractionated by electrophoresis on 1% 
agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0]) 
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the 
agarose gel and purified using a commercial DNA extraction system according to the 
instructions of the manufacturer (Qiagen), with a final elution in 50 µl of 1 mM Tris 
[pH 8.5]. 

The ends of the sonicated DNA fragments were repaired with a combination of 
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as 
follows. Reactions were performed in a reaction mixture (final volume, 100 µl) 
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM 
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA 
polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5 
units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for 
15 min at room temperature. The reaction was stopped by two phenol/chloroform 
extractions, the DNA was precipitated with ethanol, and the final DNA pellet was 
resuspended in 20 µl of H2O. 

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II 
site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with 
calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation reaction 
contained 100 ng of vector DNA and 2 to 5 µl of repaired sonicated phage DNA (50-100 
ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New England 
Biolabs) and was incubated overnight at 16°C. Transformation and selection of 
bacterial clones containing recombinant plasmids were performed in E. coli DH10β 
according to standard procedures (Sambrook et al., 1989). 
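
The ligation above is specified by mass (100 ng vector, 50-100 ng insert), whereas ligation efficiency depends on the molar ratio of insert to vector. The sketch below converts masses to picomoles using the common approximation of 660 g/mol per base pair. The vector length of 3,000 bp is a hypothetical round figure used only for illustration; it is not stated in the text.

```python
def pmol_of_dsdna(mass_ng: float, length_bp: int) -> float:
    """Picomoles of double-stranded DNA, assuming ~660 g/mol per base pair."""
    return mass_ng * 1000.0 / (length_bp * 660.0)


def insert_to_vector_ratio(insert_ng: float, insert_bp: int,
                           vector_ng: float, vector_bp: int) -> float:
    """Molar ratio of insert to vector in a ligation reaction."""
    return pmol_of_dsdna(insert_ng, insert_bp) / pmol_of_dsdna(vector_ng, vector_bp)


# Hypothetical vector length (bp); not taken from the text.
ASSUMED_VECTOR_BP = 3000

for insert_ng, insert_bp in [(50, 2000), (100, 1500)]:
    ratio = insert_to_vector_ratio(insert_ng, insert_bp, 100, ASSUMED_VECTOR_BP)
    print(f"{insert_ng} ng of {insert_bp} bp insert : 100 ng vector = {ratio:.2f} : 1")
```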

Recombinant clones were picked from agar plates into 96-well plates 
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence 
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers 
flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential 
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris (pH 
8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, 
and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as 
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec 
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C, 
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using 
the QIAprep™ spin miniprep kit (Qiagen). 
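
The thermocycling parameters above can be written down as a small data structure, which makes the total hold time easy to check. This is only an illustrative sketch; the class and variable names are introduced here and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Values taken from the thermocycling parameters described in the text.
INITIAL = Step("initial denaturation", 94, 120)
CYCLE = [
    Step("denaturation", 94, 30),
    Step("annealing", 58, 30),
    Step("extension", 72, 120),
]
FINAL = Step("final extension", 72, 600)
N_CYCLES = 20


def program_seconds(initial: Step, cycle: list, n_cycles: int, final: Step) -> int:
    """Total hold time of the program in seconds (ramping not included)."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


total = program_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"Total hold time: {total // 60} min")
```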

The nucleotide sequence of the extremities of each recombinant clone was 
determined using an ABI 377-36 automated sequencer with two types of chemistry: 
ABI prism Big Dye™ primer cycle sequencing (21M13 primer: #403055)(M13REV 
primer: #403056) or ABI prism Big Dye™ terminator cycle sequencing ready reaction 
kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and 
the genome, all regions of the phage genome were sequenced at least once from both 
directions on two separate clones. In areas where this criterion was not initially met, a 
sequencing primer was selected and phage DNA was used directly as sequencing 
template employing ABI prism BigDye™ terminator cycle sequencing ready reaction 
kit. 
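
The coverage criterion described above can be checked mechanically once reads are placed on the genome. The sketch below is one possible reading of that criterion, assumed here for illustration: a position passes if it is covered by at least one forward and one reverse read and the covering reads come from at least two distinct clones. The `Read` record and its field names are hypothetical.

```python
from typing import NamedTuple


class Read(NamedTuple):
    clone: str   # clone identifier
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def regions_needing_primers(reads, genome_len):
    """Return (start, end) intervals that fail the coverage criterion."""
    failing = []
    run_start = None
    for pos in range(1, genome_len + 1):
        hits = [r for r in reads if r.start <= pos <= r.end]
        strands = {r.strand for r in hits}
        clones = {r.clone for r in hits}
        ok = {"+", "-"} <= strands and len(clones) >= 2
        if not ok and run_start is None:
            run_start = pos                       # a failing run begins
        elif ok and run_start is not None:
            failing.append((run_start, pos - 1))  # a failing run ends
            run_start = None
    if run_start is not None:
        failing.append((run_start, genome_len))
    return failing


example_reads = [Read("c1", 1, 60, "+"), Read("c2", 40, 100, "-")]
print(regions_needing_primers(example_reads, 100))
```

Intervals returned by the function are the areas where a custom sequencing primer on phage DNA would be needed.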


Example 13. Bioinformatic management of primary nucleotide sequence. 

Sequence contigs were assembled using Sequencher™ 3.1 software 
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of 
the contigs. Phage DNA was used directly as sequencing template employing ABI 
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems; 
#4303152). The complete sequence of Enterococcus bacteriophage 182 is shown in 
Table 21. 

A software program was used on the assembled sequence of bacteriophage 182 
to identify all putative ORFs larger than 33 codons. The software scans the primary 
nucleotide sequence starting at nucleotide #1 for an appropriate start codon. Three 
possible selections can be made for defining the nature of the start codon: I) selection 
of ATG, II) selection of ATG or GTG, and III) selection of any of ATG, GTG, TTG, 
CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds to the one 
reported by the NCBI (http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) 
for the bacterial genetic code. When an 
appropriate start codon is encountered, a counting mechanism is employed to count 
the number of codons (groups of three nucleotides) between this start codon and the 
next stop codon downstream of it. If a threshold value of 33 is reached, or exceeded, 
then the sequence encompassed by these two codons is defined as an ORF. This 
procedure is repeated, each time starting at the next nucleotide following the previous 
stop codon found, in order to identify all the other putative ORFs. The scan is 
performed on all three reading frames of both DNA strands of the phage sequence. 
The predicted ORFs for bacteriophage 182 are listed in Tables 22 & 23. 

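
The scanning procedure described above can be sketched in a few lines. This is an illustrative reconstruction, not the software actually used: function names are introduced here, the codon count is assumed to run from the start codon up to (but not including) the stop codon, and coordinates for the complementary strand are reported relative to the reverse-complemented sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}
MIN_CODONS = 33

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def scan_frame(seq: str, frame: int, starts: set, min_codons: int) -> list:
    """Return (begin, end) 0-based half-open ORF spans in one reading frame."""
    orfs = []
    i = frame
    while i + 3 <= len(seq):
        if seq[i:i + 3] not in starts:
            i += 3
            continue
        j = i + 3
        while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
            j += 3
        if j + 3 > len(seq):
            break  # no in-frame stop codon downstream of this start
        if (j - i) // 3 >= min_codons:
            orfs.append((i, j + 3))  # span includes the stop codon
        i = j + 3  # resume at the nucleotide after the stop codon
    return orfs


def find_orfs(seq: str, start_set: str = "III", min_codons: int = MIN_CODONS) -> list:
    """Scan all three frames of both strands; return (strand, frame, begin, end)."""
    seq = seq.upper()
    starts = START_SETS[start_set]
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for begin, end in scan_frame(s, frame, starts, min_codons):
                found.append((strand, frame, begin, end))
    return found


demo = "CC" + "ATG" + "GCT" * 40 + "TAA" + "GG"
print(find_orfs(demo, start_set="I"))
```

Choosing `start_set` "I", "II" or "III" corresponds to the three start-codon selections listed in the text.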
Sequence homology searches for each ORF were carried out using an implementation 
of BLAST programs. Downloaded public databases used for sequence analysis 
include: 

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z); 

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z); 

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z); 

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z); 

v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa); 

vi) Streptococcus pyogenes 
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z); 

vii) PRODOM 
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom99.1.forblast.gz); 

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/); 

ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/) 

The results of the homology searches performed on the ORFs of bacteriophage 
182 are shown in Tables 24 & 26. 
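
The text does not specify which BLAST implementation or output format was used. As a hedged illustration of how per-ORF search results might be summarized today, the sketch below assumes the 12-column tabular output of the current NCBI BLAST+ programs (`-outfmt 6`) and keeps the highest-scoring hit per query. The sample rows, identifiers and E-value cutoff are invented for the example.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text: str, max_evalue: float = 1e-5) -> dict:
    """Return the highest bit-score hit per query at or below max_evalue."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(FIELDS, row))
        if float(hit["evalue"]) > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or float(hit["bitscore"]) > float(current["bitscore"]):
            best[hit["qseqid"]] = hit
    return best


# Hypothetical rows for illustration only.
SAMPLE = (
    "ORF001\tsubjA\t45.0\t200\t110\t2\t1\t200\t5\t204\t1e-40\t150\n"
    "ORF001\tsubjB\t30.0\t180\t126\t3\t1\t180\t9\t188\t2e-10\t60.5\n"
    "ORF002\tsubjC\t25.0\t50\t37\t1\t1\t50\t3\t52\t0.5\t20.1\n"
)

for query, hit in best_hits(SAMPLE).items():
    print(query, hit["sseqid"], hit["evalue"])
```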

Example 14. Sub-Cloning of Bacteriophage 182 ORFs. 
Preparation of the shuttle expression vector 

Expression preferably utilizes a shuttle expression vector which is arranged 
such that expression of the exogenous bacteriophage 182 ORF sequence is inducible. 
For example, the plasmid pND50 replicates in E. coli, E. faecalis, and S. aureus 
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S., 
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163). This plasmid 
can be modified by conventional techniques to insert the inducible arsenite promoter, 
derived from the shuttle vector pT0021, in which the firefly luciferase (lucFF) 
expression is controlled by the ars promoter/operator from a S. aureus plasmid 
(Tauriainen, S., Karp, M., Chang, W and Virta, M. (1997). Recombinant luminescent 
bacteria for measuring bioavailable arsenite and antimonite. Appl. Environ. Microbiol. 
63:4456-4461). This modified shuttle vector will contain the ars promoter, arsR gene 
and a cloning site for introduction of individual phage ORFs downstream from a 
Shine-Dalgarno sequence. 

Other inducible regulatory sequences can be utilized instead of the arsenite- 
inducible system. An example is a nisin-inducible system. The nisA promoter activity 
is dependent on the proteins NisR and NisK, which constitute a two-component signal 
transduction system that responds to the extracellular inducer nisin. The nisin 
sensitivity and inducer concentration required for maximal induction vary among 
strains, but the system is functional in Streptococcus pyogenes, Streptococcus agalactiae, 
Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis. Significant 
induction of the nisA promoter (10- to 60-fold induction) can be obtained in all of these 
species. A vector containing this promoter was published as Eichenbaum Z, Federle 
MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR (1998) Appl 
Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, can also be utilized 
which will allow replication and transcription in Enterococcus. 

Alternatively, a constitutive promoter can be used (e.g., the β-lactamase 
promoter is constitutive in E. faecalis - see ref. 1) to drive expression of the 
introduced ORF, and cell growth compared with that of control bacterial cells containing the 
parental vector lacking any introduced phage ORF. Recombinant plasmids are 
introduced into E. faecalis strain FA2-2 by electroporation, as previously described 
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S., 
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163). 
Cloning of ORFs with a Shine-Dalgarno sequence 

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of 
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop 
codon), will be amplified by PCR from phage genomic DNA. For PCR amplification 
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a 
restriction site and each antisense strand primer starts at the last codon (excluding the stop 
codon) and is preceded by a different restriction site. The PCR product of each ORF 
will be gel purified and digested with the restriction enzymes with sites contained on 
the PCR oligonucleotides. The digested PCR product is then gel purified using the 
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial 
strain DH10β. Recombinant clones are then picked and their insert sizes confirmed by 
PCR analysis using primers flanking the cloning site as well as restriction digestion. 
The sequence fidelity of cloned ORFs will be verified by DNA sequencing using the 
same primers as used for PCR. In cases where verification of an ORF cannot be 
achieved by one pass of sequencing using primers flanking the cloning site, internal 
primers will be selected and used for sequencing. Recombinant plasmids will be 
introduced into E. faecalis strain FA2-2 by electroporation, as previously described 
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S., 
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163). 
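
The primer layout described above (restriction site followed by the first codons for the sense primer; a different restriction site followed by the reverse complement of the last codons, stop codon excluded, for the antisense primer) can be sketched as follows. The restriction sites shown (GGATCC and AAGCTT), the annealing length and the function name are assumptions chosen for illustration; the text does not name the sites used. In practice a few extra 5' bases are often added to aid digestion, which this sketch omits.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(orf: str, fwd_site: str, rev_site: str, anneal_len: int = 18):
    """Return (sense, antisense) primers, both written 5' to 3'."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if orf[-3:] in STOP_CODONS:
        orf = orf[:-3]  # exclude the stop codon from the amplified region
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    sense = fwd_site + orf[:anneal_len]
    antisense = rev_site + reverse_complement(orf[-anneal_len:])
    return sense, antisense


# Hypothetical ORF and site choices for illustration only.
toy_orf = "ATG" + "GCT" * 10 + "TAA"
sense, antisense = design_primers(toy_orf, "GGATCC", "AAGCTT", anneal_len=12)
print(sense)
print(antisense)
```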
Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be 
assessed, for example, by either of the following two methods. 
1. Screening on agar plates 

The functional identification of killer ORFs can be performed by spreading an aliquot 
of E. faecalis transformed cells containing a phage 182 ORF onto agar plates containing 
different concentrations of sodium arsenite (0, 2.5, 5, and 7.5 µM). The plates are 
incubated overnight at 37°C, after which growth inhibition of the ORF 
transformants on plates that contain arsenite is compared with that on plates without arsenite. 
2. Quantification of growth inhibition in liquid medium 

Cells containing different recombinant plasmids can be grown overnight at 
37°C in LB medium supplemented with the appropriate antibiotic selection. These are 
then diluted to the mid log phase (OD540 = 0.2) with fresh media containing antibiotic 
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at 
different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated 
for an additional 2 hrs at 37°C. The effect of expression of the phage 182 ORFs on 
bacterial cell growth is then monitored by measuring the OD540 and comparing the rate 
of growth to that of the culture not containing inducer. As positive controls for growth 
inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W. and 
Blasi, U. 1993 Virology #193: 1033-1036), and the holin/lysin genes of the 
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G., 
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters #162:265-274) were 
subcloned into the ars inducible vector. An aliquot of the induced and uninduced 
culture can also be plated out on agar plates containing an appropriate antibiotic 
selection but lacking inducer. Following incubation overnight at 37°C, the number of 
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but 
detectable, number of colonies on the agar plates when grown in the presence of 
inducer as compared to when grown in the absence of inducer. Any ORF showing 
bactericidal activity will show no colonies on the agar plates when grown in the 
presence of inducer as compared to when grown in the absence of inducer. 
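
The plating read-out described above amounts to a simple decision rule on colony counts. The sketch below is illustrative only: the text says "lower, but detectable" without giving a numerical cutoff, so the `static_fraction` threshold is a hypothetical parameter introduced here.

```python
def classify_activity(colonies_uninduced: int, colonies_induced: int,
                      static_fraction: float = 0.5) -> str:
    """Classify an ORF from colony counts of uninduced vs induced cultures.

    static_fraction is an assumed cutoff: induced counts below this fraction
    of the uninduced count are called bacteriostatic.
    """
    if colonies_uninduced <= 0:
        raise ValueError("uninduced control must yield colonies")
    if colonies_induced == 0:
        return "bactericidal"
    if colonies_induced < static_fraction * colonies_uninduced:
        return "bacteriostatic"
    return "no clear inhibition"


# Hypothetical counts for illustration.
for uninduced, induced in [(200, 0), (200, 40), (200, 190)]:
    print(uninduced, induced, classify_activity(uninduced, induced))
```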

Example 15. Growth of Streptococcus bacteriophage Dp-1 and purification of 
genomic DNA 

The Streptococcus pneumoniae R6 propagating strain (PS) (Tomasz, 
1966) was used as host to propagate its respective phage Dp-1 (McDonnell et al., 
1975). (Alternatively, Streptococcus (Diplococcus) pneumoniae R36A could be used. 
Strain R36A is available from ATCC as #11733 or 27336. Streptococcus pneumoniae 
is also available from Felix d'Herelle Reference Center in Quebec, Canada as catalog 
number HER 1054. Other S. pneumoniae strains are also available from ATCC.) 
Two rounds of plaque purification of phage Dp-1 were performed on soft agar 
essentially as described in Sambrook et al. (1989). Briefly, the Streptococcus R6 PS 
strain was grown overnight at 37°C in K-CAT media [K-CAT: 10 g Bacto casitone, 5 g 
Bacto tryptone, 1 g yeast extract, 5 g potassium chloride, 0.2% glucose, 30 mM 
potassium phosphate buffer [pH 8] and 250,000 units catalase per liter (Boehringer 
Mannheim #10683600)]. The culture was then diluted 20-fold in K-CAT and 
incubated at 37°C until the OD540 = 0.2 (early log phase) with constant agitation. In 
order to obtain single plaques, Dp-1 phage was subjected to 10-fold serial dilutions 
using the phage buffer (100 mM Tris-HCl [pH 7.5], 100 mM NaCl and 10 mM 
MgCl2) and 10 µl of each dilution was used to infect 0.5 ml of the cell suspension. 
After incubation for 15 min at 37°C, 2 ml of melted soft agar (K-CAT supplemented 
with 0.8% agar) were added to the mixture, which was poured onto the surface of 100 mm 
K-CAT agar plates [K-CAT supplemented with 1.2% agar]. After solidification of 
the soft agar layer, an additional 5 ml of melted soft agar was added to visualize 
distinct plaques (Ronda et al., 1978). After overnight incubation at 37°C, a single 
plaque was isolated, resuspended in 1 ml of phage buffer by end over end rotation for 
2 hrs at room temperature, and the phage suspension was diluted and used for a 
second infection as described above. After overnight incubation at 37°C, a single 
plaque was isolated and used as a stock for all subsequent manipulations. 
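
The 10-fold serial dilution step determines how many plaques appear on each plate, and conversely a plaque count gives the stock titer. A minimal sketch of both calculations follows; the stock titer of 1 x 10^9 pfu/ml in the example is a hypothetical value, while the 10 µl plated volume is the one given in the text.

```python
def expected_plaques(stock_pfu_per_ml: float, dilution_step: int,
                     volume_ul: float = 10.0, fold: float = 10.0) -> float:
    """Plaques expected when plating volume_ul of the n-th serial dilution."""
    diluted_titer = stock_pfu_per_ml / fold ** dilution_step
    return diluted_titer * volume_ul / 1000.0


def titer_from_count(plaques: int, dilution_step: int,
                     volume_ul: float = 10.0, fold: float = 10.0) -> float:
    """Stock titer (pfu/ml) inferred from a plaque count on one plate."""
    return plaques * fold ** dilution_step * 1000.0 / volume_ul


# Hypothetical stock titer for illustration.
ASSUMED_STOCK_TITER = 1e9

for step in range(3, 8):
    print(f"dilution 10^-{step}: {expected_plaques(ASSUMED_STOCK_TITER, step):g} plaques")
```

Plates from the dilution steps that give well-separated plaques are the ones suitable for picking a single plaque.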

The propagation procedure for bacteriophage Dp-1 was modified from the 
agar layer method of Swanstrom and Adams (1951). Briefly, the R6 strain of 
Streptococcus pneumoniae was grown to stationary phase overnight at 37°C in K- 
CAT. The culture was then diluted 20-fold in K-CAT and incubated at 37°C until the 
OD540 = 0.2. The suspension (15 x 10^7 bacteria) was then mixed with 15 x 10^5 plaque 
forming units (pfu) to give a ratio of 100 bacteria/pfu. After incubation for 15 min at 
37°C, 7.5 ml of melted soft agar (K-CAT plus 0.8% agar) were added to the mixture, 
which was poured onto the surface of 150 mm K-CAT agar plates and incubated for 16 hrs at 
37°C. After solidification of the soft agar layer, 7.5 ml of melted soft agar were added 
to each plate. To collect the plate lysate, 20 ml of K-CAT media were added to each 
plate and the soft agar layers were collected by scraping them off with a clean microscope 
slide, followed by vigorous shaking of the agar suspension for 5 min to break up the 
agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 xg) using a 
JA-10 rotor (Beckman) and the supernatant (lysate) was collected and 
treated with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate 
the phage particles, the phage suspension was adjusted to 10% (w/v) PEG 8000 and 
0.5 M NaCl, followed by incubation at 4°C for 16 hrs. The phage was recovered by 
centrifugation at 4,000 rpm (3,500 xg) for 20 min at 4°C in a GS-6R table top 
centrifuge (Beckman). The pellet was resuspended in 2 ml of phage buffer (100 
mM Tris-HCl [pH 7.5], 100 mM NaCl and 10 mM MgCl2). The phage suspension 
was extracted with 1 volume of chloroform and further purified by centrifugation on a 
cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS 55 
rotor and centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 hrs at 28,000 
rpm (67,000 xg) at 4°C. Banded phage was collected and ultracentrifuged again on an 
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 xg) for 24 hrs at 
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 hrs at 
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM 
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage 
suspension by adding 20 mM EDTA, 50 µg/ml Proteinase K and 0.5% SDS and 
incubating for 1 hr at 65°C, followed by successive extractions with 1 volume of 
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was 
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 8.0], 1 mM 
EDTA). 
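
The infection step in this procedure is defined by the ratio of bacteria to phage. The short sketch below reproduces that arithmetic and shows how the phage input would be scaled for a different number of cells; the function names are introduced here for illustration.

```python
def bacteria_per_pfu(n_bacteria: float, n_pfu: float) -> float:
    """Ratio of bacterial cells to plaque forming units in the infection mix."""
    return n_bacteria / n_pfu


def pfu_for_ratio(n_bacteria: float, ratio: float) -> float:
    """Phage input (pfu) needed to reach a given bacteria/pfu ratio."""
    return n_bacteria / ratio


ratio = bacteria_per_pfu(15e7, 15e5)   # values from the propagation procedure
moi = 1.0 / ratio                      # multiplicity of infection (pfu per cell)
print(f"{ratio:g} bacteria/pfu, MOI = {moi:g}")
```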


Example 16. DNA sequencing of the Bacteriophage Dp-1 genome. 

Four micrograms of phage DNA was diluted in 200 µl of TE (10 mM Tris, 
[pH 8.0], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonication was performed 
(550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an 
amplitude of 3 µm with bursts of 5 sec spaced by 15 sec of cooling in ice/water for 3 to 4 
cycles. The sonicated DNA was then size fractionated by electrophoresis on 1% 
agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0]) 
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the 
agarose gel and purified using a commercial DNA extraction system according to the 
instructions of the manufacturer (Qiagen), with a final elution in 50 µl of 1 mM Tris 
[pH 8.5]. 

The ends of the sonicated DNA fragments were repaired with a combination of 
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as 
follows. Reactions were performed in a reaction mixture (final volume, 100 µl) 
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM 
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA 
polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5 
units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for 
15 min at room temperature. The reaction was stopped by two phenol/chloroform 
extractions, the DNA was precipitated with ethanol, and the final DNA pellet was 
resuspended in 20 µl of H2O. 

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II 
site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with 
calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation 
reaction contained 100 ng of vector DNA and 2 to 5 µl of repaired sonicated phage DNA 
(50-100 ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New 
England Biolabs) and was incubated overnight at 16°C. Transformation and selection 
of bacterial clones containing recombinant plasmids were performed in E. coli DH10β 
according to standard procedures (Sambrook et al., 1989). 

Recombinant clones were picked from agar plates into 96-well plates 
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence 
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers 
flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential 
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris (pH 
8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, 
and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as 
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec 
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C, 
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using 
the QIAprep™ spin miniprep kit (Qiagen). 

The nucleotide sequence of the extremities of each recombinant clone was 
determined using an ABI 377-36 automated sequencer with two types of chemistry: 
ABI prism Big Dye™ primer cycle sequencing (21M13 primer: #403055)(M13REV 
primer: #403056) or ABI prism Big Dye™ terminator cycle sequencing ready reaction 
kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and 
the genome, all regions of the phage genome were sequenced at least once from both 
directions on two separate clones. In areas where this criterion was not initially met, a 
sequencing primer was selected and phage DNA was used directly as sequencing 
template employing ABI prism Big Dye™ terminator cycle sequencing ready reaction 
kit. 


Example 17. Bioinformatic management of primary nucleotide sequence. 

Sequence contigs were assembled using Sequencher™ 3.1 software 
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge 
of the contigs. Phage DNA was used directly as sequencing template employing ABI 
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems; 
#4303152). The complete sequence of Streptococcus bacteriophage Dp-1 is shown in 
Table 28. 

A software program was used on the assembled sequence of bacteriophage 
Dp-1 to identify all putative ORFs larger than 33 codons. The software scans the 
primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon. 
Three possible selections can be made for defining the nature of the start codon: I) 
selection of ATG, II) selection of ATG or GTG, and III) selection of any of ATG, 
GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds 
to the one reported by the NCBI 
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) 
for the bacterial genetic code. When an 
appropriate start codon is encountered, a counting mechanism is employed to count 
the number of codons (groups of three nucleotides) between this start codon and the 
next stop codon downstream of it. If a threshold value of 33 is reached, or exceeded, 
then the sequence encompassed by these two codons is defined as an ORF. This 
procedure is repeated, each time starting at the next nucleotide following the previous 
stop codon found, in order to identify all the other putative ORFs. The scan is 
performed on all three reading frames of both DNA strands of the phage sequence. 
The predicted ORFs for bacteriophage Dp-1 are listed in Tables 29 and 30, and Fig. 6. 

Sequence homology searches for each ORF were carried out using an 
implementation of BLAST programs. Downloaded public databases used for 
sequence analysis include: 

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z); 

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z); 

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z); 

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z); 

v) Staphylococcus aureus NCTC 8325 
(ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa); 

vi) Streptococcus pyogenes 
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z); 

vii) PRODOM 
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom99.1.forblast.gz); 

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/); 

ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/) 

The results of the homology searches performed on the ORFs of bacteriophage 
Dp-1 are shown in Table 31. 


Example 18. Sub-Cloning of Bacteriophage Dp-1 ORFs. 
Preparation of the shuttle expression vector 

Expression preferably utilizes a shuttle expression vector which is arranged 
such that expression of the exogenous bacteriophage Dp-1 ORF sequence is inducible. 
For example, the plasmid pLSE4 replicates in E. coli and S. pneumoniae (Diaz and 
Garcia, 1990). This plasmid can be modified by conventional techniques to insert the 
inducible arsenite promoter, derived from the shuttle vector pT0021, in which the 
firefly luciferase (lucFF) expression is controlled by the ars promoter/operator from a 
S. aureus plasmid (Tauriainen, S., Karp, M., Chang, W and Virta, M. (1997). 
Recombinant luminescent bacteria for measuring bioavailable arsenite and antimonite. 
Appl. Environ. Microbiol. 63:4456-4461). This modified shuttle vector will contain 
the ars promoter, arsR gene and a cloning site for introduction of individual phage 
ORFs downstream from a Shine-Dalgarno sequence. 

Other inducible regulatory sequences can be utilized instead of the arsenite- 
inducible system. An example is a nisin-inducible system. The nisA promoter activity 
is dependent on the proteins NisR and NisK, which constitute a two-component signal 
transduction system that responds to the extracellular inducer nisin. The nisin 
sensitivity and inducer concentration required for maximal induction vary among 
strains, but the system is functional in Streptococcus pyogenes, Streptococcus agalactiae, 
Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis. Significant 
induction of the nisA promoter (10- to 60-fold induction) can be obtained in all of these 
species. A vector containing this promoter was published as Eichenbaum Z, Federle 
MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR (1998) Appl 
Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, can also be utilized 
which will allow replication and transcription in Streptococcus. 

Alternatively, a constitutive promoter can be used to drive expression 
of the introduced ORF, and cell growth compared with that of control bacterial cells containing 
the parental vector lacking any introduced phage ORF. Recombinant plasmids are 
introduced into S. pneumoniae R6 as previously described (Diaz and Garcia, 1990). 

Cloning of ORFs with a Shine-Dalgarno sequence 

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of 
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop 
codon), will be amplified by PCR from phage genomic DNA. For PCR amplification 
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a 
restriction site and each antisense strand primer starts at the last codon (excluding the stop 
codon) and is preceded by a different restriction site. The PCR product of each ORF 
will be gel purified and digested with the restriction enzymes with sites contained on 
the PCR oligonucleotides. The digested PCR product is then gel purified using the 
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial 
strain DH10β. Recombinant clones are then picked and their insert sizes confirmed 
by PCR analysis using primers flanking the cloning site as well as restriction 
digestion. The sequence fidelity of cloned ORFs will be verified by DNA sequencing 
using the same primers as used for PCR. In cases where verification of an ORF 
cannot be achieved by one pass of sequencing using primers flanking the cloning site, 
internal primers will be selected and used for sequencing. Recombinant plasmids will 
be introduced into S. pneumoniae R6 as previously described (Diaz and Garcia, 1990). 
Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be 
assessed, for example, by either of the following two methods. 

1. Screening on agar plates 

The functional identification of killer ORFs can be performed by spreading an 
aliquot of S. pneumoniae transformed cells containing phage Dp-1 ORFs onto agar 
plates containing different concentrations of sodium arsenite (0, 2.5, 5, and 7.5 µM). 
The plates are incubated overnight at 37°C, after which growth inhibition of the 
ORF transformants on plates that contain arsenite is compared with that on plates without 
arsenite. 

2. Quantification of growth inhibition in liquid medium 

Cells containing different recombinant plasmids can be grown overnight at 
37°C in LB medium supplemented with the appropriate antibiotic selection. These are 
then diluted to the mid log phase (OD540 = 0.2) with fresh media containing antibiotic 
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at 
different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated 
for an additional 2 hrs at 37°C. The effect of expression of the phage Dp-1 ORFs on 
bacterial cell growth is then monitored by measuring the OD540 and comparing the rate 
of growth to that of the culture not containing inducer. As positive controls for growth 
inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W. and 
Blasi, U. 1993 Virology #193: 1033-1036), and the holin/lysin genes of the 
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G., 
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters #162:265-274) can be 
subcloned into the ars inducible vector. An aliquot of the induced and uninduced 
culture can also be plated out on agar plates containing an appropriate antibiotic 
selection but lacking inducer. Following incubation overnight at 37°C, the number of 
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but 
detectable, number of colonies on the agar plates when grown in the presence of 
inducer as compared to when grown in the absence of inducer. Any ORF showing 
full bactericidal activity will show no colonies on the agar plates when grown in the 
presence of inducer as compared to when grown in the absence of inducer. 
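
For the liquid-medium assay, the comparison of induced and uninduced growth can be expressed as a percent inhibition computed from OD540 readings. The sketch below is one simple way to do this and is not prescribed by the text; the readings in the example are hypothetical.

```python
def percent_inhibition(od_start: float, od_induced: float, od_uninduced: float) -> float:
    """Growth inhibition (%) of the induced culture relative to the uninduced one.

    Growth is taken as the increase in OD540 over the induction period.
    """
    growth_uninduced = od_uninduced - od_start
    if growth_uninduced <= 0:
        raise ValueError("uninduced control shows no growth")
    growth_induced = od_induced - od_start
    return 100.0 * (1.0 - growth_induced / growth_uninduced)


# Hypothetical readings: start at OD540 = 0.2, measured again after 2 hrs.
print(f"{percent_inhibition(0.2, 0.3, 0.6):.0f}% inhibition")
```

A value near 0% indicates no effect of the ORF, while a value near or above 100% indicates that growth stopped or the culture lysed after induction.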
All patents and publications mentioned in the specification are indicative of
the levels of skill of those skilled in the art to which the invention pertains. All
references cited in this disclosure are incorporated by reference to the same extent as
if each reference had been incorporated by reference in its entirety individually.

One skilled in the art would readily appreciate that the present invention is
well adapted to carry out the objects and obtain the ends and advantages mentioned,
as well as those inherent therein. The specific methods and compositions described
herein as presently representative of preferred embodiments are exemplary and are not
intended as limitations on the scope of the invention. Changes therein and other uses
will occur to those skilled in the art which are encompassed within the spirit of the
invention as defined by the scope of the claims.

It will be readily apparent to one skilled in the art that varying substitutions
and modifications may be made to the invention disclosed herein without departing
from the scope and spirit of the invention. For example, those skilled in the art will
recognize that the invention may suitably be practiced using a variety of different
bacteria, bacteriophage, and sequencing methods within the general descriptions
provided.

The invention illustratively described herein suitably may be practiced in the
absence of any element or elements, limitation or limitations which is not specifically
disclosed herein. Thus, for example, in each instance herein any of the terms
"comprising," "consisting essentially of" and "consisting of" may be replaced with
either of the other two terms. The terms and expressions which have been employed
are used as terms of description and not of limitation, and there is no intention in
the use of such terms and expressions of excluding any equivalents of the features
shown and described or portions thereof, but it is recognized that various
modifications are possible within the scope of the invention claimed. Thus, it should
be understood that although the present invention has been specifically disclosed by
preferred embodiments and optional features, modification and variation of the
concepts herein disclosed may be resorted to by those skilled in the art, and that such
modifications and variations are considered to be within the scope of this invention as
defined by the appended claims.




In addition, where features or aspects of the invention are described in terms of
Markush groups or other grouping of alternatives, those skilled in the art will
recognize that the invention is also thereby described in terms of any individual
member or subgroup of members of the Markush group or other group. For example,
if there are alternatives A, B, and C, all of the following possibilities are included: A
separately, B separately, C separately, A and B, A and C, B and C, and A and B and
C. Thus, for example, for the bacteria and phage specified herein, the embodiments
expressly include any subset or subgroup of those bacteria and/or phage. While each
such subset or subgroup could be listed separately, for the sake of brevity, such a
listing is replaced by the present description.
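
The enumeration of alternatives A, B, and C given above corresponds to listing every non-empty subset of the group. Purely as an illustration, and with a hypothetical function name that is not part of the disclosure, this can be sketched as:

```python
from itertools import combinations


def markush_subsets(members):
    """Return every non-empty subset of a group of alternatives,
    smallest subsets first (hypothetical helper for illustration)."""
    members = list(members)
    return [
        combo
        for size in range(1, len(members) + 1)
        for combo in combinations(members, size)
    ]


subsets = markush_subsets(["A", "B", "C"])
```

For three alternatives this yields the seven possibilities recited above; a group of n members has 2^n - 1 such subsets, which is why a separate listing quickly becomes impractical.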

Thus, additional embodiments are within the scope of the invention and within 
the following claims. 
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CLAIMS 



What is claimed is: 

1. A method for identifying a bacteriophage coding region encoding a
product active on an essential bacterial target, comprising identifying a nucleic acid
sequence encoding a gene product which provides a bacteria-inhibiting function when
said bacteriophage infects a host bacterium,

wherein said bacteriophage is uncharacterized and said host bacterium
is a pathogenic bacterium.

2. The method of claim 1, further comprising expressing a recombinant
bacteriophage ORF in cells of a bacterial strain, wherein inhibition of said cells
following expression of said ORF is indicative that said product is active on an
essential bacterial target.

3. The method of claim 2, wherein inhibition of said bacterium following
expression of said ORF is determined by comparison with the growth or viability of
said bacterium following expression of an inactivated mutant form of said ORF or in
the absence of expression of said ORF, and wherein inhibition of said bacterium
following expression of said ORF is indicative that said product is active on an
essential bacterial target.



4. The method of claim 2, wherein expression of said ORF is inducible.

5. The method of claim 1, further comprising sequencing at least a
portion of a bacteriophage genome.

6. The method of claim 1, wherein at least a portion of the nucleotide
sequence of a bacteriophage genome is known, said method further comprising
identifying at least one ORF in said portion by computer analysis of said sequence.

7. The method of claim 6, further comprising analyzing the sequence of
said at least one ORF or of a polypeptide encoded by said ORF to identify
homologous genes or gene products of known biochemical function, thereby
indicating the biochemical function of said polypeptide.

8. The method of claim 7, wherein said homologous gene or gene product
is a bacterial gene important for cell viability.



9. The method of claim 7, wherein said homologous gene or gene product
is a gene or gene product known to have a bacteria-inhibiting function.

10. The method of claim 6, further comprising analyzing the sequence of
said at least one ORF or of a polypeptide encoded by said ORF to identify structural
motifs in said polypeptide, thereby indicating the cellular function of said polypeptide.



11. The method of claim 1, wherein a host bacterium for said
bacteriophage is selected from the species group consisting of bacteria listed in Table
1.

12. The method of claim 1, wherein said bacteriophage is selected from the
group consisting of uncharacterized bacteriophage listed in Table 1.

13. The method of claim 2, wherein a plurality of bacteriophage ORFs are 
expressed in at least one bacterium. 


14. The method of claim 13, wherein each of said plurality of 
bacteriophage ORFs is expressed in a different bacterium. 

15. The method of claim 14, wherein said plurality of bacteriophage ORFs
comprises at least 10% of the ORFs in the genome of said bacteriophage.

16. The method of claim 1, wherein said pathogenic bacterium is an animal
pathogen.

17. The method of claim 16, wherein said pathogenic bacterium is a human
pathogen.

18. The method of claim 1, wherein said pathogenic bacterium is a plant
pathogen.



19. The method of claim 1, further comprising confirming the inhibitor
function of said ORF.

20. The method of claim 19, wherein said confirming comprises
expressing a loss-of-function mutant form of said ORF in said host bacterium.

21. The method of claim 1, wherein said identifying a nucleic acid
sequence encoding a gene product active on an essential bacterial target comprises
identifying a nucleic acid sequence encoding a homolog of a bacteriophage
polypeptide known to be active on an essential bacterial target.

22. The method of claim 1, wherein said identifying a bacteriophage
coding region comprises identifying a first coding region from a bacteriophage having
a non-pathogenic host bacterial strain related to said pathogenic bacterium, said first
coding region encoding a product active on an essential bacterial target; and
identifying a homolog of said first coding region, wherein said
homolog is a probable said bacteriophage coding region encoding a product active on
an essential bacterial target.

23. The method of claim 2, wherein a plurality of bacteriophage ORFs 
from a plurality of different bacteriophage are expressed in at least one bacterium. 



24. The method of claim 23, wherein each of said plurality of
bacteriophage ORFs is expressed in a different bacterium.



25. A method for identifying a target for antibacterial agents, comprising
determining the bacterial target of an uncharacterized bacteriophage inhibitor protein.

26. The method of claim 25, wherein said determining comprises
identifying at least one bacterial protein which binds to said bacteriophage inhibitor
protein or a fragment thereof.

27. The method of claim 26, wherein said binding is determined using
affinity chromatography on a solid matrix.

28. The method of claim 25, wherein said determining comprises
identifying at least one protein:protein interaction using a genetic screen.

29. The method of claim 28, wherein said genetic screen is a yeast two-hybrid
screen.

30. The method of claim 25, wherein said determining comprises a
co-immunoprecipitation assay or a protein-protein crosslinking assay.

31. The method of claim 25, wherein said determining comprises
identifying a mutated bacterial coding sequence which protects a bacterium from said
bacteriophage inhibitor.

32. The method of claim 25, wherein said determining comprises
identifying a bacterial coding sequence which protects a bacterium against said
bacteriophage inhibitor when expressed at high levels in said bacterium.

33. The method of claim 25, wherein said determining further comprises
identifying a bacterial nucleic acid sequence encoding a polypeptide target of said
bacteriophage inhibitor protein.

34. The method of claim 33, wherein said nucleic acid sequence is
identified by determining at least a portion of the amino acid sequence of a bacterial
protein target, and identifying a bacterial nucleic acid sequence which encodes said
protein target.

35. The method of claim 25, wherein said bacterial target is naturally
produced by a bacterial species selected from the group consisting of species of the
genera listed in Table 1.

36. The method of claim 25, wherein said bacterial target is naturally
produced by a bacterial strain selected from the group consisting of species listed in
Table 1.

37. The method of claim 25, wherein said inhibitor protein is naturally
produced by a bacteriophage selected from the group consisting of uncharacterized
bacteriophage listed in Table 1.

38. The method of claim 25, further comprising identifying a
bacteriophage ORF which encodes a product having a bacteria-inhibiting function.
39. The method of claim 38, wherein said identifying a phage ORF
comprises expressing at least one bacteriophage ORF in a bacterium, wherein
inhibition of said bacterium following said expression is indicative that said ORF
encodes a bacteria-inhibiting function.

40. The method of claim 39, wherein a plurality of bacteriophage ORFs are
expressed in at least one bacterium.

41. The method of claim 40, wherein each of said plurality of
bacteriophage ORFs is expressed in a different bacterium.

42. The method of claim 41, wherein said plurality of bacteriophage ORFs
comprises at least 10% of the ORFs in the genome of said bacteriophage.

43. The method of claim 25, wherein said determining the bacterial target
of a bacteriophage inhibitor protein is performed for a plurality of different
bacteriophage of the same host bacterium.

44. The method of claim 25, wherein said bacterial target originates from
an animal pathogen.

45. The method of claim 44, wherein said bacterial target is a gene
homologous to a gene from an animal pathogen.

46. The method of claim 44, wherein said pathogen is a human pathogen.

47. The method of claim 25, wherein said bacterial target originates from a
plant pathogen.

48. The method of claim 25, wherein said bacterial target is a gene
homologous to a gene from a plant pathogen.

49. The method of claim 25, further comprising determining the cellular or
biochemical function or both of said inhibitor protein.
50. The method of claim 25, wherein said identifying the bacterial target
comprises identifying a phage-specific site of action.

51. An isolated, purified, or enriched nucleic acid sequence at least 15
nucleotides in length, wherein said sequence corresponds to at least a portion of a
bacteriophage sequence, and wherein said bacteriophage is selected from the group
consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD,
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

52. The nucleic acid sequence of claim 51, wherein said sequence
comprises at least 50 nucleotides.

53. The nucleic acid sequence of claim 51, wherein said nucleic acid
sequence corresponds to at least a portion of a nucleic acid sequence which encodes a
product which provides a bacteria-inhibiting function.

54. The nucleic acid sequence of claim 53, wherein said nucleic acid
sequence encodes a polypeptide which provides a bacteria-inhibiting function.

55. The nucleic acid sequence of claim 54, wherein said nucleic acid
sequence is transcriptionally linked with regulatory sequences enabling induction of
expression of said sequence.

56. An isolated, purified, or enriched polypeptide comprising at least a
portion of a protein providing a bacteria-inhibiting function, wherein said polypeptide
is normally encoded by a bacteriophage selected from the group consisting of
Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, Enterococcus
bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

57. The polypeptide of claim 56, wherein said polypeptide provides said
bacteria-inhibiting function.

58. The polypeptide of claim 56, wherein said polypeptide comprises a
portion at least 10 amino acid residues in length of a said polypeptide normally
encoded by said bacteriophage.
59. A recombinant vector comprising a bacteriophage ORF corresponding
to an ORF from a bacteriophage having a pathogenic bacterial host, wherein said
bacterial host is selected from the group consisting of uncharacterized bacteria of
Table 1.

60. The vector of claim 59, wherein said vector is an expression vector.

61. The vector of claim 59, wherein said bacteriophage is selected from the
group consisting of uncharacterized bacteriophage of Table 1.

62. The vector of claim 61, wherein said bacteriophage is selected from the
group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD,
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

63. The vector of claim 60, wherein expression of said ORF is inducible.

64. A recombinant cell comprising a vector, wherein said vector comprises
an ORF from a bacteriophage having a pathogenic bacterial host, wherein said
bacterial host is selected from the group consisting of bacterial species of Table 1.

65. The recombinant cell of claim 64, wherein said bacteriophage is
selected from the group consisting of uncharacterized phage of Table 1.

66. The cell of claim 65, wherein said bacteriophage is selected from the
group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD,
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1.

67. The cell of claim 64, wherein said vector is an expression vector and
expression of said ORF is inducible.

68. A method for identifying an antibacterial agent, comprising identifying
an active portion of a product of a bacteria-inhibiting ORF of a bacteriophage.
69. The method of claim 68, further comprising constructing a synthetic
peptidomimetic molecule, wherein the structure of said molecule corresponds to the
structure of said active portion.

70. A method for identifying a compound active on a target of a
bacteriophage inhibitor protein, comprising the steps of:

contacting a bacterial target protein with a test compound; and

determining whether said compound binds to or reduces the level of
activity of said target protein,

wherein binding of said compound with said target protein or a
reduction of the level of activity of said protein is indicative that said compound is
active on said target and wherein said target is uncharacterized.

71. The method of claim 70, wherein said contacting is carried out in vitro.

72. The method of claim 70, wherein said contacting is carried out in vivo 
in a cell. 

73. The method of claim 70, wherein said compound is a small molecule. 

74. The method of claim 70, wherein said compound is a peptidomimetic 
compound. 

75. The method of claim 70, wherein said compound is a fragment of a 
bacteriophage inhibitor protein. 

76. The method of claim 70, further comprising determining the site of 
action of said compound on said target protein. 

77. The method of claim 70, wherein said contacting is performed for a
plurality of said target proteins. 



78. A method of screening for potential antibacterial agents, comprising
the step of determining whether any of a plurality of compounds is active on a target
of a bacteriophage inhibitor protein,

wherein said target is naturally produced by a pathogenic bacterium.

79. The method of claim 78, wherein said plurality of compounds are
small molecules.

80. The method of claim 78, wherein said determining is performed for a
plurality of said targets.

81. A method for inhibiting a bacterium, comprising the step of:

contacting said bacterium with a compound active on a target of a
bacteriophage inhibitor protein, wherein said target or the target site is
uncharacterized.

82. The method of claim 81, wherein said compound is said protein or an
active fragment thereof.

83. The method of claim 81, wherein said compound is a structural
mimetic of said protein.

84. The method of claim 81, wherein said compound is a small molecule.

85. The method of claim 81, wherein said contacting is performed in vitro.

86. The method of claim 81, wherein said contacting is performed in vivo
in an animal.

87. The method of claim 86, wherein said animal is a human.

88. The method of claim 81, wherein said contacting is carried out in vivo
in a plant.

89. The method of claim 81, wherein said bacterium is selected from the
group of bacteria listed in Table 1.
90. A method for treating a bacterial infection in an animal suffering from
an infection, comprising administering to said animal a therapeutically effective
amount of compound active on a target of a bacteriophage inhibitor protein in a
bacterium involved in said infection,

wherein said target is an uncharacterized target or the compound is active at an
uncharacterized target site.

91. The method of claim 90, wherein said compound is a small molecule.

92. The method of claim 90, wherein said compound is a peptidomimetic
compound.

93. The method of claim 90, wherein said compound is a fragment of a
bacteriophage inhibitor protein.

94. The method of claim 90, wherein said animal is a mammal.

95. The method of claim 94, wherein said mammal is a human.

96. The method of claim 90, wherein said bacterium is selected from the
group listed in Table 1.

97. The method of claim 90, wherein said bacteriophage inhibitor protein
is from a bacteriophage selected from the group of bacteriophage listed in Table 1.

98. A method for prophylactically treating an animal at risk of an infection,
comprising administering to said animal a prophylactically effective amount of a
compound active on a target of a bacteriophage inhibitor protein,

wherein said target is an uncharacterized target or the site of action of
said compound is an uncharacterized target site.

99. The method of claim 98, wherein said compound is a small molecule.

100. The method of claim 98, wherein said compound is a peptidomimetic
compound.
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101. The method of claim 98, wherein said compound is a fragment of a 
bacteriophage inhibitor protein. 



102. The method of claim 98, wherein said animal is a mammal. 

103. The method of claim 102, wherein said mammal is a human. 



104. An antibacterial agent active on a target of a bacteriophage inhibitor
protein, wherein said target is an uncharacterized target or said agent is active at a
phage-specific site on said target.

105. The agent of claim 104, wherein said agent is a peptidomimetic of a
bacteriophage inhibitor polypeptide.
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106. The agent of claim 104, wherein said agent is a small molecule. 

107. The agent of claim 104, wherein said agent is a fragment of a
bacteriophage inhibitor polypeptide.

108. The agent of claim 104, wherein said agent is active at a phage-specific 
site on said target. 



109. A method of making an antibacterial agent, comprising the steps of:

a) identifying a target of a bacteriophage inhibitor polypeptide;

b) screening a plurality of test compounds to identify a compound
active on said target; and

c) synthesizing said compound in an amount sufficient to provide a
therapeutic effect when administered to an organism infected by a bacterium naturally
producing said target.

110. The method of claim 109, wherein said compound is a small molecule.

111. The method of claim 109, wherein said compound is a peptidomimetic
compound.

112. The method of claim 109, wherein said compound is a fragment or
derivative of a bacteriophage inhibitor protein.



113. A computer readable device having recorded therein a nucleotide
sequence of a portion of at least one bacteriophage genome of Staphylococcus aureus
bacteriophage 77, bacteriophage 3A, or bacteriophage 96, a nucleotide sequence at
least 95% identical to a said nucleotide sequence, a ribonucleic acid equivalent, a
degenerate equivalent, a homologous sequence, or at least one amino acid sequence
encoded by said nucleotide sequence; and

a nucleotide sequence or amino acid sequence analysis program,
wherein said program can perform at least one sequence analysis on said
nucleotide or amino acid sequence.

114. The device of claim 113, wherein said at least a portion of at least one
bacteriophage genome comprises at least one ORF.

115. The device of claim 113, wherein said device comprises a medium
selected from the group consisting of floppy disk, computer hard drive, optical disk,
computer random access memory, and magnetic tape wherein said nucleotide or
amino acid sequence or said program or both are recorded on said medium.

116. The device of claim 113, wherein said portion of at least one
bacteriophage genomic nucleotide sequence comprises at least 50% of at least one
bacteriophage genomic sequence.

117. The device of claim 113, wherein said at least one bacteriophage
nucleotide genomic sequence comprises portions of a plurality of bacteriophage
nucleotide genomic sequences.

118. A computer-based system for identifying biologically important
portions of a bacteriophage genome, comprising:

a) a data storage medium having recorded thereon a nucleotide sequence
corresponding to a portion of at least one bacteriophage genome, wherein said
bacteriophage genome is uncharacterized;
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b) a set of instructions allowing searching of said sequence to analyze said 
sequence; and 

c) an output device. 



119. The system of claim 118, wherein said output device
comprises a device selected from the group consisting of a printer, a video display,
and a recording medium.

120. The system of claim 118, wherein said bacteriophage genome is of a
bacteriophage selected from the group consisting of uncharacterized bacteriophage
listed in Table 1.

121. The system of claim 118, wherein said uncharacterized bacteriophage
is selected from the group consisting of bacteriophage 77, 3A, and 96.



122. A method for identifying or characterizing a bacteriophage ORF,
comprising the steps of:

a) providing a computer-based system for analyzing nucleic acid or
amino acid sequence data, wherein said system comprises a data storage medium
having recorded thereon at least one nucleotide or amino acid sequence corresponding
to a portion of at least one uncharacterized bacteriophage genome, a set of instructions
allowing searching of said sequence to analyze said sequence; and an output device;

b) analyzing at least a portion of at least one said sequence; and

c) outputting results of said analyzing to said output device.

123. The method of claim 122, wherein said analysis identifies sequence
similarity or homology with sequences selected from the group consisting of bacterial
ORFs encoding products with related biological function; ORFs encoding known
inhibitors of bacteria; and essential bacterial ORFs.

124. The method of claim 122, wherein said analysis comprises identifying
a probable biological function based on identification of structural elements or
sequence homology or similarity.

125. The method of claim 122, wherein said bacteriophage is selected from
the group consisting of uncharacterized bacteriophage listed in Table 1.

126. The method of claim 125, wherein said uncharacterized bacteriophage
is selected from bacteriophage 77, 3A, and 96.
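
By way of illustration only, the kind of computer analysis of a recorded nucleotide sequence referred to in the claims (identifying at least one ORF and outputting the result) might be sketched as follows. This is a minimal sketch under stated assumptions, not the analysis program of the invention: the function name, the restriction to ATG start codons on the forward strand, and the minimum length parameter are all hypothetical choices; a practical analysis would also scan the reverse complement and consider alternative start codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) tuples for ATG...stop ORFs in the three
    forward reading frames. 'end' is exclusive and includes the stop
    codon; 'min_codons' counts the start and stop codons."""
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # resume scanning after the stop codon
    return orfs


# Toy sequence, invented for this example (not phage data).
example = "CCATGAAATTTTAGGG"
orfs = find_orfs(example)
```

Each reported ORF could then be translated and compared against sequence databases to look for homologs of known function, as contemplated in the dependent claims.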
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ABSTRACT 



A method for identifying suitable targets for antibacterial agents based on
identifying targets of bacteriophage-encoded proteins is described. Also described are
compositions useful in the identification methods and in inhibiting bacterial growth,
and methods for preparing and using such compositions.
Table 1

Phages against human and animal pathogenic bacteria



Pathogen 
name 



Acinetobacter 
calcoaceticus 



Acinetobacter 
haemolyticus 



Acinetobacter 
I johnsonii 

Acinetobacter sp. 



Phage name 

A3/2~ 
A10/45 
A36 
B9GP 
B,PP 
BS46 

E13 

E14 

S31_ 
Ap3 
P78 



n. 



Cat 
alo 



BP1 



G4.HP2, HP3& 
HP4 



Actinobacillus
actinomycetemcomitans



Al, A4, A9 & 
196 



A19, A23, A29, 
A3t.A33.A34, 
A3759 & 2845 



Origin/reference 



Felix d'Herelle Reference 
Centre.Quebec.Quebec 



J. Bacteriol. 1984. 157: 179-183

j ; GenJvlic i ob^^ 
Felix d'Herelle Reference 
Centre.Quebec.Quebec 



Felix d'Herelle Reference 
Centre.Quebec.Quebec 



IVirdJ9i8j27l6-722 

JVirol.l974.13:46-52& 
Arch.ViroU994.l35:345-354 



I^rolc^^ 
CRHebdo Seances Acad.Sc.SerD.Sa 
Satur(Paris)278:1907-1909& 
Arch.ViroU994.135:345-354& 
Rev Can^ioUm29^lTO^ 
^^^ioiuttl994- 119:329-337 









Infec. Immun. 1982. 35: 343-349 








Mol.Gen.Genet 1998.258: 323-325 




Aaq>247 




Oral Microbiol. Immunol. 1997. 12: 40-46


Actinomyces viscosus 




43146-B1 


The American Type Culture Collection 








Infect. Immun. 1985. 48: 228-233








Infect. Immun. 1988. 56: 54-59








Plasmid 1997.37:141-153 


Aeromonas hydrophila 


PM2** & PM3 




FEMS Microbiol.Lett. 1990.57:277-282 


Aehl 
Aeh2 
PM4 
PM5 
PM6 
T7-ah 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 
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Aeromonas 
salmonicida 


3 

25 
29 
31 
32 

40RR 2 .»t 
43 
51 
56 
59.1 
65 

Asp37 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


55R.1 




Can. J. Microbiol. 1983. 29: 1458-1461 


Alteromonas espejiana 


PM2** 


27025-B1 


The American Type Culture Collection 


Asticcacaulis
biprosthecum 






Felix d'Herelle Reference
Centre,Quebec,Quebec 


Asticcacaulis 
excentricus 


<|>Ac21 
<(>Ac24 


1 JZOI-DI 

15261-B2 
15261-B3 


The American Type Culture Collection


Azotobacter vinelandii 


A14 
A21 
A31 
A41 
PAVl 


12518-B1 
12518-B4 
12518-B5 
12518-B9 
12518-BIO 

1 "J7AC D 1 

13705-B1 


The American Type Culture Collection 


Azotobacter sp.






Virology 1972.49:439-452 


Bacteroides fragilis 

- 


Bf-1 




Rev. Infect Dis. 1979. 1: 325-336 


B40-8 




FEMS Microbiol. Lett. 1991. 66: 61-67 


HSP40 




Appl. Environ. Microbiol. 1989. 55: 2696- 
2701 


phiAl 




Zentralbl. Bakteriol. 1972. 222: 57-63


Bdellovibrio 
bacteriovorus 


MAC-1 




J. Gen. Microbiol. 1987. 133: 3065-3070 


Bdellovibrio sp. 


VL-1 




J.Virol.l973.12:1522-1533 


Bordetella 
bronchiseptica


214 




Zh.Mikrobiol.Epidemiol.Immuno. 1987.5:9- 
13 
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Bordetella 
parapertussis 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 








Mol. Gen. Mikrobiol. Virusol. 1 988.4: 2z-25 






■ 


Zh.Mikrobiol.Epidemiol.Immuno. 1987.5:9- 
13 




41405 




Zh.Mikrobiol.Epidemiol.Immuno. 1987.5:9- 
13 


Brucella abortus 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 






23448-Bl 
23448-B2 
23448-B3 
17385-B1 
17385-B2 


The American Type Culture Collection 




10/1 

24/n 

212/XV 








BK-2.TB & 
Fi** 




Zh.Mikrobiol.Epidemiol.Immunobiol. 1983.2: 
48-52 




R/c&R/O 




Dev. Biol. Stand. 1984.56: 55-62 


Brucella canis 


R/c 




Dev. Biol. Stand. 1984.56: 55-62 


Brucella melitensis 


BK-2 


23456-B1 


The American Type Culture Collection 


Brucella suis 


Wb 




Zentralbl. Veterinarmed. 1975. 22: 866-867
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Fi** & TB 




Zh.Mikrobiol.Epidemiol.Immunobiol. 1 983.2: 








48-52 


Brucella sp. 






Can. J. Vet. Res. 1989.53: 319-325 








Res. Vet. Sci. 1988. 44: 45^9 




R 




Zh.Mikrobiol.Epidemiol.Immunobiol.1983.2:








48 


Campylobacter coli 




43133-B1 


The American Type Culture Collection 






43134-B1 




Campylobacter coli 


18 


43135-B1 


The American Type Culture Collection 


(Cont'd) 


19 


43136-B1 






20 






Campylobacter jejuni 


1 


35918-B1 


The American Type Culture Collection 


2 


35919-B1 






3 


35920-B1 






4 


35921-B1 






5 


35918-B2 






6 


35920-B2 






7 


35922-B2 






8 


35923-B1 






9 


35924-B1 






10 


35925-Bi 






11 


35925-B2 






12 


35922-B2 






13 


35924-B2 






14 








17 


43133-Bl 






18 


43134-Bl 






19 



43135-Bl 






20 


43136-B1 




Campylobacter 


HP1 




J. Med. Microbiol.1993. 38: 245-249 


(Helicobacter) pylori 








Chlamydia psittaci 


Chpl** 




J. Gen. Virol. 1989. 70: 3381-3390 


Clostridium 


CAK-1 




J.Bacteriol. 1993. 175:3838-3843 


acetobutylicum 
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Clostridium botulinum 






Nucleic Acids Res.l990.18:1291 








Bioch.Biophys.res.Commun.1990.171.1304- 
1311 








Microbiol. Immunol, lyoi.zj.y i j-yz / 








J. Vet.Med.Sci. 1992.54:675-684 j 




CE 3 &CEy 






Clostridium difficile 


41&56 




J. Clini.Microbiol. 1985.21:251-254 
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Clostridium 






Rev.Can.Biol. 1 977.36:205-2 15 


perfringens 


■■ 












FEMS Microbiol.Lett. 1990.54:323-326 


Clostridium 




8074-Bl 


The American Type Culture Collection 


sporogenes 


59 


17886-B1 






70 


17886-B3 






71 


17886-B4 






72S 


17886-B5 






72L 


17886-B6 




Clostridium tetani 


A & B 




Rev.Can.Biol.l978.37:43-46 


Corynebacterium 






Vopr. Virusol. 1 986.3 1:577-584 


diphtheriae








Corynebacterium 


NN 


12319-B1 


The American Type Culture Collection 


pseudotuberculosis 








Corynebacterium sp 


DLC 2921/49 


12052-B1 


The American Type Culture Collection 
Enterococcus faecalis


42 


19948-B1 


The American Type Culture Collection 


Enterococcus faecium 


124 
133 


19950-B1
19953-B2
19953-B1


The American Type Culture Collection 
Escherichia coli



Escherichia coli 
(Cont'd) 



C204 

El 

f1**

f2** 

FCZ 

fd** 



Ifl« 



MS2** 

MU9 

Mu-1 

Ox6

P1**

P4 sidj ** 

Qβ**

R17** 

Z1K/1 

ZJ/2 



11303-B14 

11303-B10 

11303-B21 

8677-B1 

11303-B13 

13706-B4 

15766-B1

15766-B1

1242-B5 
15669-B2 

15767-B1
11303-B16 
27-65-B1 
25065-B2 
15669-B1 
15597-B1 
21816-B1
23724-B9 
15593-B1 
25404-B1 
29746-B1 
23631-B1
25868-B1 
25298-B1 
25298-B2 
11303-B37 
11303-B24 
11303-B26 
11303-B27 
11303-B28 
11303-B29 
11303-B30 
11303-B33
11303-B31
11303-B25
11303-B35
11303-B34
11303-B36
11303-B32
13706-B5
11303-B1
11303-B2
11303-B3
11303-B4
35060-B1
35060-B2
35060-B3
11303-B5 
11303-B6 
11303-B7 
m<VUR38 



The American Type Culture Collection 



The American Type Culture Collection
Escherichia coli
(Cont'd) 



547 
UV1 
UV47 
UV375 
a3** 
X ** 
XC-17 
X sus P-3 
X sus R-5 
X sus J-6 
X sus 0-8 
X sus A-11
X ind" 
^92 

φX174**
$Ccs70am-3 


11303-B20 

11303-B17 

11303-B15 

11303-B11

11303-B18 

13706-B2 

23724-B2

23724-B1

23724-B3 

23724-B4 

23724-B5 

23724-B6 

23724-B7 

23724-B8 

35860-B1

13706-B3 

15597-B2 

I J /l/O-D 1 
H7O7O-D 1 


The American Type Culture Collection 






G4** & <#C** 




Biochim.Biophys.Acta 1992.1130:277-288


BF23** 




J.Bacteriol.1977.129:265-275


Mul 




J.Ultrastruct.Res.1966.14:441-448


Hpl7 




J.Mol.Biol.1991.218:705-721


K3**&Ox2** 




FEBS Lett.1987.215:145-150


Rbl8**,Rb51 & 
Rb69** 




J.Bacteriol.1990.172:180-186


H1**, H3, H8,
K9,

K18 & Ox1




Mol.Gen.Genet.1990.221:491-494


M1**, TuIa** &
TuIb**




J.Mol.Biol.1987.196:165-174


K10 




J.Bacteriol.1979.140:680-686


Qsr* 




J.Bacteriol.1985.162:256-262


B278 




J.Gen.Microbiol.1988.134:1333-1338


phi 80** 




FEMS Microbiol.Lett.1994.119:71-76


phiml73 




Genetika 1985.21:673-675 


tf-1 




J.Gen.Microbiol.1987.133:953-960


P4 & phiR73 




Mol.Microbiol.1995.18:201-208


I2-2




J.Gen.Microbiol.1982.128:2797-2804


PRD1 




Virology 1990.177:445-451 


K3hx 




Mol.Gen.Genet.1987.206:110-115


Oil T**^ 

933W** 




Infprt Tmmiinitv lQRfi SVH5-140 


H19-B** 




J.Bacteriol.1987.169:4308-4312


Tcp-111 




Zentralbl.Bakteriol.Mikrobiol.Hyg.1988.270:41-51
N4**



Phi80trp 



Obeta 1 



P1CM 



PA-2* 



186* 



186.IX.B 



21' 



P4* 



82 * 



PSP3 



HK022 



** 



D108* 



Vet.Microbiol.1992.30:203-212



Ann.Inst.Pasteur.1971. 120:121-125 



J.Bacteriol.1978.133:172-177



J.Gen.Microbiol. 1978.107:73-83 



J.Bacteriol. 1990. 172:1660-1662 



Mol.Gen.Genet.1982.187:87-95



Mol.Microbiol.1992.6:2629-2642



Virology 1983.129:484-489 



Microbiol.Rev.1993.57:683-702



J.Biol.Chem.1987.262:11721-11725



J.Bacteriol. 1996. 178:5668-5675 



Nucleic Acids Res.1994.22:354-356



Nucleic Acids Res.1986.14:3813-3825



Escherichia coli 
(Cont'd) 



Rb49 



J.Mol.Biol.1997.267:237-249



Ike* 



J.Mol.Biol.1985. 181:27-39 



P22dis 



Mol.Gen.Genet.1978.166:233-243



N15**



J.Bacteriol.1996.178:1484-1486



IfV 



Proc.R.Soc.Lond.B.Biol.Sci. 1991 .245:23-30 



Stx2Phi-I & 
Stx2Phi-II 



Infect.Immun.1998.66:4100-4107



18 



Virology 1987.156:122-126 



J.Gen.Microbiol.1981.126:389-396



AC3 



Mol.Microbiol.1991.5:715-725
BW-1




Felix d'Herelle Reference 




C-l 




Centre, Quebec,Quebec 




E920g 








Esc-7-11 








H19J 








Haiti








HK243








la i 








K20 








K30 








KL 3 








M 








Mu** 








O103 








O157:H7








P1D 








ptl 








PilHa 








PR64FS 








PR772 








SS4 








p4Q 








Xvir** 








CIS 








09-1 








92 






Haemophilus 


HP1** 




Nucleic Acids Res. 1996.24:2360-2368 


influenzae 


S2** 




Gene 1997. 196: 139-144 


Halobacterium 


S45 




Felix d'Herelle Reference 


cutirubrum 






Centre,Quebec,Quebec 


Halobacterium 






Felix d'Herelle Reference


halobium 






Centre,Quebec,Quebec








Can.J.Microbiol.1982.28:916-921


Halobacterium 






Biol.Chem.Hoppe Seyler 1994.375:747-757 


salinarium 












Klebsiella oxytoca 


tf-1 




J.Gen.Microbiol.l987.133:953-960 


Klebsiella pneumoniae 


60 
92 


23356-B1

23357-B1


The American Type Culture Collection 




K19Q 




Felix d'Herelle Reference 
Centre,Quebec,Quebec




FC3-1 & FC3-9 




Can.J.Microbiol.1991. 37:270-275 




FC3-10 




FEMS Microbiol.Lett.1991.67:291-297


Klebsiella sp. 


K11**




Mol.Gen.Genet.1990.221:283-286


Leptospira sp. 


LE1, LE3 & LE4




Res.Microbiol.1990.141:1131-1138


Listeria 


243 


23074-B1 


The American Type Culture Collection 


monocytogenes 


197,1313 & 
9425 




Appl.Environ.Microbiol.1997.63:3374-3377




H387 & H387-A 




Appl.Environ.Microbiol.1993.59:2914-2917




5775,6223 
& 12682 




APMIS.1993.101:160-167




2389, 2671, 
4211 & 2685 




Intervirology 1994.37:31-35 & 
Zentralbl.Bakteriol.Mikrobiol.Hyg.1986.261:12-28




4b, 4ab, 4g & 3c 




Ann.Microbiol.(Paris) 1977.128:185-198




A118, A500 & 
A511**




IUr.1 N/firrnhinl 100S 16 1231-1241-992 




1, 3, 4, 5, 6, 7, 8,
9, 10, 11, 14, 15,
16, 17, 19 & 20




Ann.Microbiol.(Paris) 1979.130B:179-189




1/2a, 1/2b, 3c,
4ab, 6a & 6b 




Clin.Invest.Med. 1984.7:229-232 




φLMUP35
2685 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Listeria innocua


4211 




Felix d'Herelle Reference 
Centre,Quebec,Quebec


Micrococcus luteus 


N3 
N4 
N8 


4698-B1 
4698-B4 
4698-2 
4698-B3 


The American Type Culture Collection 


Micrococcus luteus 


N17 




Can.J.Microbiol.1979.25:1027-1035


Mycobacterium 
smegmatis 


BK-3 
Bol** 
Bo 6 
Bo 611 
Bo 6111 
Mc-2 
Mc-4 
NN 

Phagus lacticola 
Rl 


27203-B1

27204-B1

27205-B1
27205-B2 

607-B6 

607-B7 

11727-B1 

11759-B1 

607-B1 


The American Type Culture Collection 
HER 317
HER 330 
HER 333 
HER 335 
HER 334 
HER 331 
HER 316


Felix d'Herelle Reference
Centre,Quebec,Quebec 




Legendre 
Leo 
Roy 
Sedge 












Mol.Microbiol. 1993.7:395-405 








1 MaI Rinl 1008 970- 141 1A4 

J.iVlOl.DlOl. 1 77G.4 fy. IHJ-IOH 








rTOC.INaU.ACaQ.oCi UoA. iy0o.01.^oJj-i0 J / 








Mol.Biol.Rep.1981.30:11-15








Proc.Natl.Acad.Sci.USA 1997.94:10961-10966




29M, 31M, 122,
154, 37, 29D, 46, 
139,110,141, 
74D AOl A: 
DS6A 




Arch. Virol. 1993. 133:39-49 & 
Am.Rev.Respir.Dis.1975.112:17-22


Mycobacterium 
fortuitum 


Bo 4 
Bo 7 


23052-B1 
27207-B1 
27207-B2 


The American Type Culture Collection 
Mycobacterium leprae






Ann.Microbiol. (Paris) 1982.133:93-97 


Mycobacterium 
tuberculosis


DS6A 


25618-B1

ZJDIO-DZ 

4243-B1 


The American Type Culture Collection 




110, 139 & 33D




Arch. Virol. 1993. 133:39-49 




AG1, GS4E,
BG1,

PH & BK1




The Biology of Mycobacteria. Academic
Press, Toronto 1982 (Ratledge & Stanford)

1982.309-351 


Mycobacterium sp 


Phagus pellegrini 

NN 

Bl 


11760-B1 
11761-B1
23239-B1 


The American Type Culture Collection
TM4, ph60,

ph72, 

PhAE39, 

phAE40 

& Bxb1




Microbiology 1995.141:1173-1181 




C2 




Experientia 1969.25:1112-1113




18&U5 




J.Gen. Virol. 1987.68:949-956 




63 




Gruzlica 1968.36:617-622 




phlei & 
butyricum 




J.Gen. Virol. 1975.29:235-238 




MyF3P-59a 




Z.Allg.Mikrobiol.1968.8:29-37




Bo2a 




J.Gen. Virol. 1973.20:75-87 




D4, D28 & D32




J.Exptl.Med.1966.123:327-340




HC 




J.Bacteriol.1963.86:608-609


Mycobacterium 
vaccae 


B5 


15483-B1 


The American Type Culture Collection 


Mycobacterium phlei 


NN 
Bo 2 
Bo 2h 
Bo 3 


11728-B1
11758-B1
27086-B2
27086-B1


The American Type Culture Collection 


Mycoplasma 
arthritidis 


MAV1** 




Infect.Immunity 1995.63:4016-4023


Mycoplasma hyorhinis 


Hr-1 




Arch.Virol.1983.77:81-85


Mycoplasma 
pneumoniae 


Br-1 




Arch.Virol.1983.75:1-15


Mycoplasma pulmonis 






Plasmid 1995. 33: 41-49 


Mycoplasma sp. 






J.Gen.Microbiol. 1 9 85 : 1 3 1 :3 1 1 7-3 1 2o 








J.Virol.1986.59:584-590








Gene 1994. 141: 1-8 


Microbios 1990. 64: 111-125 






Infection & Immunity 1995.63:4016-4023






Med.Biol.1982.60: 116-120 


MV-L2& 




Arch.Virol.1979.61:289-296






Acta. Virol. 1978.22:443-450 






J.Gen.Virol.1979.42:315-322






Virology 1973.55:118-126 


Science 1971.173:725-727 


Neisseria perflava






J.Clin.Microbiol.1976. 4:87-91 


Nocardia erythropolis


φC




J.Gen. Virol. 1974.23:247-254 




φEC




J.Bacteriol.1976.126:1104-1107


Pasteurella multocida 


B225 




Arch.Exp.Veterinarmed.1981.35:433-436




B939a 




Am.J.Vet.Res.1978.39:1565-1566




Nos. 115, 32, 967
& 1075




Vet.Med.Nauki. 1977.14:33-36 


Propionibacterium 
acnes 


NN 


29399-B1 


The American Type Culture Collection






Pseudomonas 
aeruginosa 





12175-B1 


The American Type Culture Collection 


2 


12175-B2 




2A 


12175-B3 




2B 


12175-B4 




11 


14205-B1 




16 


14206-B1 




24 


14207-B1




27 


14208-B1 




44 


14209-B1 




73 


14210-B1 




95 


14211-B1




109 


14212-B1 




113 


14213-B1 




249 


14214-B1 




B3 


15692-B1 




Hoff2 


14203-B1 




Hoff3 


14204-B1 




Pa 


12055-B1 




Pb 


12055-B2 




PB-1 


15692-B3 




Pc 


12055-B3 




Pf 


25102-B1 




PP7** 


15692-B2 








Felix d'Herelle Reference 






Centre,Quebec,Quebec 


7 & 31 






PrJ** 




J. Virol. lyoJ.4 1 .11 \-lL3 


φ-MC




^an.j.jviicrooioi. l^o?. i j. 1 1 /y-i i oo 


Pf1**




J.Mol.Biol.1991.218:349-364


PR4** 




J.Gen. Virol. 1979.43:583-592 


A7 




J.Bacteriol.1992.174:2407-2411


KF1 




J.Biochem.1983.93:61-71


φCTX**




Mol.Microbiol.1993.4:1703-1709






J. Virol.1977.24: 135-141 
φKZ, 21, φNZ,




dd< 




PMN17, PTB80, 








68, PB-1, E79,








16, 








109, 352, 1214, 








F8, 71, 337, M4,








φC17, SL2, B17,








Li-24, φmnP78,








P<?17** ml 73 








M6, Li-2, 7, 








φmnF82,








PTB2 PTB20 








PTB42, cpKJ 4 / 7, 








31, PTB21,








1 1 Qv 








φPLS27, B3,








258, 








T_r„,1 T Xi\A <*7 

rlwlz, rM) /, 








rMOZ, rMlUj, 








148 PM681 








198, 








218 222 242 

ALOy 4.£.£.y £.1£.y 








246 








PC131,<pCU, 








SL5, 








D3112**, Jb19,








F7, 








PM69, PM13, 








PM61, PM113, 








φ240, 249 & 269










Pseudomonas 

aeruginosa 

(Cont'd) 



297, 309,318, 
11, 



Arch.Virol.1993.131:141-151






Pseudomonas cepacia 




- 


Felix d'Herelle Reference 
Centre,Quebec,Quebec


Pseudomonas fragi


wy 


27362-B1
27363-B1


The American Type Culture Collection 


Pseudomonas
phaseolicola 


φ6




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Pseudomonas putida 


gh-1


12633-B1 


The American Type Culture Collection 


Pseudomonas syringae 


φ6


40492-B1 
21781-B1 


The American Type Culture Collection 


Pseudomonas sp.


PPs-G3 


49780-B1 


The American Type Culture Collection 


kjLllfflUfltZitli UU 1 tZiltjf 


Sab 2 




Felix d'Herelle Reference
Salmonella enteritidis


1, 2,3 & 6 




Epidemiol.Infect.1995.114:227-236


2a, 3a, 4a, 5a, 6a, 
7a, 8a, 9a, 15, 

17, L\J OCX. 1 




Vet.Med.Nauki.1975. 12:55-60 


Salmonella newington 


Epsilon 34 




J.Struct.Biol. 1995.115:283-289 


Salmonella newport 


16-19 


27869-B1 
27869-B2 


The American Type Culture Collection 
Salmonella paratyphi


Paratyphoid A 


19940-B1 
12176-B1 


The American Type Culture Collection 




Jersey 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Salmonella 
senftenberg 


SasLl, SaL2, Sal 
3, 

SaL4, SaL5 & 
SasL6 




Indian J.Med.Res. 1997.105:47-52 


Salmonella 
typhimurium 


P22** 
SL-1 


19585-B1 
40282 


The American Type Culture Collection 


MB78** 




J.Virol.1982.41:1038-1043




SE1 




J.Gen.Microbiol.1986.132:1035-1041




LT2 




Virology 1971.45:835-636 




ES18** 




Virology 1970.42:621-632 




L** 




J.Virol.1985.56:1034-1036


P1CMclr-100




Mol.Gen.Genet.1975.138:113-126




F22 




Genet.Res.1986.48:139-143




Fels 1 




J.Gen. Virol. 1978.38:263-272 




Fels 2 




Genet.Res.1986.48:139-143




Px 




Mol.Gen.Genet.1970.108:184-202




Pike 




Virology 1974.60:503-514 




A3& A4 




J.Bacteriol. 1987.169:1003-1009 




HT 




Genet.Res.1976.27:315-322


Salmonella 


IRA 




J.Basic Microbiol. 1990.30:707-716 


typhimurium 


Mudl 




Mol.Gen.Genet. 1986.202:327-330 


(Cont'd) 


P22 (cir4-1, cir5-1 & cir6-1)




Mol.Gen.Genet.1984.198:105-109








BF23** 




Mol.Gen.Genet.1976.147:195-202




KM 




J.Bacteriol.1974.117:907-908




P221dis 




J.Gen.Virol.1978.41:367-376




PRD1** 




Virology 1990.177:445-451 




I2-2**




J.Gen.Microbiol.1982.128:2797-2804




tf-1 




J.Gen.Microbiol. 1987. 133:953-960 




X** 




J.Gen.Microbiol.1981.126:389-396


Salmonella 


8 


19937-B1 


The American Type Culture Collection 


typhosa/typhi 


23 


19938-B1 




25 


19939-B1 






46 


19942-B1 






53 


19943-B1 






163 


19946-B1 






175 


19947-B1 






Vil 


27870-B1 






ViVI 


27870-B2 






01 




Felix d'Herelle Reference








Centre,Quebec,Quebec




vm 




Chung Hua Liu Hsing Ping 








H.T.C. 1992.13:288 




12 




J.Gen.Microbiol.1983.129:3395-3400


Salmonella sp. 


P3 


25957-B1 


The American Type Culture Collection 




P4** 


25957-B2 






P9a 


25957-B3 






P9c 


25957-B4 






P10 


25957-B5 






102 


19945-B1 






Chi(x) 


9842-B1 






R34 


97541 






MG40 




Virology 1968.34:521-530 




P14 




Microb.Pathog.1990.8:393-402




PSP3 




Virology 1992.188:414 




Ike** 




Zentralbl.Bakteriol.1976.234:294-304




P27 & 9NA 




J.Virol. 1986.12:921-931 


Sphaerotilus natans 


SN1 




Appl.Environ.Microbiol. 1 979.37: 1 025- 1 030 
Shigella dysenteriae


P2 


23351-B1 

11456b 

11456a-B1


The American Type Culture Collection 


Shigella flexneri


D20 


12661-B1 


The American Type Culture Collection 




SfII**




Mol.Microbiol.1997.26:939-950




SfV** 




Gene 1997.22:217-227 




Sf6** 




Mol.Microbiol.1995.18:201-208




SfX 




Gene 1993.129:99-101 


Shigella sonnei 


C16** 






Ufa 




Mol.Biol.(Mosk) 1977.11:323-331


Shigella sp 


37 


23354-B1 


The American Type Culture Collection 


Spiroplasma citri 


SpV1




Plasmid 1993.29:193-205 


Spiroplasma sp. 


SpV1-R8A2B




Nucleic Acids Res. 1990.18:1293 


SpV3 




Isr.J.Med.Sci.1987.23:429-433




SpV4 




J.Bacteriol. 1987.169:4950-4961 


Staphylococcus albus 






Staphylococci & Staphylococcal 

Infections. 1997. 

Vol 1:503-508 (Karger,Basel)
Staphylococcus aureus





Z 1 /UZ-B 1 




Z 1 1 \Jj Lj 1 




77704 PJ 

Z / / KJH'D 1 




Z J JOU-D 1 




777A1 ni 


1 < 

1 J 


z / /Uj-rJl 


17 


277IZ-B1 


7Q 


T7£OA d i 




Z /09 1-B I 


47F 


T7/CO') 111 

z /oyz-Bi 


47 


til 

z /oyj-Bi 


^7 
JZ 


Z /OV4-t51 




2769 j-oi 


JJ 


z /oyo-bi 




27697-B1 


JJ 


27698-B1


7 1 
/ 1 


27699-B1 


/ J 


27693-B2 


77 
/ / 


277UU-B 1 


7Q 


27701-B1 


oU 


T7*7A^C D 1 

277UO-B1 


5 1 


277U/-BI 


97 A 

OJA 


1T7AQ D 1 

27708-B1 


o4 


33742 




33741-B1 


88 


15565 


92 


19685-B1 


5504' 


11987-B1 


K 


11988-B1


PI 


15752-B1 


P14 




UC18 





The American Type Culture Collection 


HER 101 
HER 239 
HER 283 
HER 49 
Twort**








φ11**




J.Bacteriol.l988.170:2409-2411 




φ13** & φ42**




J.Gen.Microbiol.1989.135:1679-1697




L54a** 




J.Bacteriol.1986.166:385-391




80a** 




Can.J.Microbiol.1996.43:612-616




94,95 & 96 




J.Clin.Microbiol.1988.26:2395-2401




φ131, A3 & A5




Staphylococci & Staphylococcal 
Infections. 1997. 
Vol 1:503-508 (Karger,Basel)




Phi PVL**




Gene 1998.215:57-67


Staphylococcus 
carnosus 


BaSTC2 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Staphylococcus 
epidermidis 


1a, 2b, 3a, 4b,
5a, oo, /o, oc, 
9a, 10a, 11b, 12a
& 13b 




Can.J.Microbiol.1988.34:1358-1361




41 63 1180 
138, 

245 336 392 & 
550 




Res.Virol.1994.145:111-121


Staphylococcus
saprophyticus


11 54 A 1405 
1314, 1139 & 
1259 




Res. Virol. 1990. 141: 625-635 & 
Res.Virol.1994.145:111-121


Staphylococcus sp. 


Phi 812, Phi 131, 
SK311 & U16




Virology 1998.246:241-252 


Streptococcus faecalis 


VD13 


HER44 


Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Streptococcus faecium 


PE1 




Zentralbl.Bakteriol.1975.231:421-425


Streptococcus oralis 


Cp-1** & Cp-7**




FEMS Microbiol.Lett.1989.65:187-192
Streptococcus
pneumoniae 


Cp-1** 


HER223 


Felix d'Herelle Reference 
Centre,Quebec,Quebec


Cp-1**, Cp-5**,
Cp-7**, Cp-9**,

ω-1 & ω-2




J.Virol.1981.40:551-559 &
Eur.J.Biochem.1979.101:59-64 &
Microbial Drug Resistance 1997.3:165-176 




HB-623 & HB-746




J.Virol.1990.64:5149-5155








fcJ - I 




J.Bacteriol.1992.174:5516-5525




Dp-2 & Dp-4




J. V HOI. iyf o.ZO.xZ 




Dp-1 




Virology iy /d.oj.j / f-joi 




ω-3 & ω-8




J. Virol. 1976.19:659-667 




304 




J.Bacteriol.1980.141:1298-1304




HB-1, HB-2,




J.Bacteriol.1979.138:618-624




HB-3**, 








HB-4, HB-5 & 








HB-6 






Streptococcus 
pyogenes 


T12** 




Mol.Microbiol.1997.23:719-728


A-1 
A-6 
A-25 
Kjem 


12202-B1

12203-B1

12204-B1
14918 


The American Type Culture Collection 


Streptococcus 
sp./Enterococcus 


1 

182 

VD1884 


HER 339 
HER 80 
HER 323 


Felix d'Herelle Reference
Centre,Quebec,Quebec




1A 


12169-B1 


The American Type Culture Collection 




1B


12170-B1 






NN 


21597-B1 






42 


19948-B1 






118 


19951-B2






120 


19952-B1 




Veillonella rodentium 


N2 




Antonie Van Leeuwenhoek 1989.56:263-271


Vibrio cholerae 


Psi 92 




Intervirology 1993.36:237-244


VCB-1,2,3 & 4 




J.Infection 1998.36:131




CP-T1** 




J.Virol.1984.51:163-169




VSK 




FEMS Microbiol.Lett.1996.145:17-22




Phi 138 




J.Virol. 1986.57:960-967 




Phi 149 




J.Virol.1985.140:217-223




Fs-2** 


c 


Microbiology 1998.144:1901-1906 
e4




Felix d'Herelle Reference 




e5 




Centre,Quebec,Quebec 




X29 








P 








K 








13 








14 








16 








24 








32 








57 






Vibrio cholerae 


138 


14100-B1 


The American Type Culture Collection 


(Cont'd) 


145 


14100-B2 






149 


14100-B3






163 


14100-B4 






N-4 


51352-B1 






S-5 


51352-B2 






S-20 


51352-B3 






M-4 


51352-B4 






D-10 


51352-B5 






I 


51352-B6






II


51352-B7 






III


51352-B8 






IV 


51352-B9 






V


51352-B10




Vibrio costicola 


UTAK 




Felix d'Herelle Reference 








Centre,Quebec,Quebec 


Vibrio eltor 


c 4 




J.Gen.Virol.1987.68:1411-1416


Vibrio natrigens 


nt1, nt6




Felix d'Herelle Reference 






Centre,Quebec,Quebec


Vibrio 


KVP40** 




Felix d'Herelle Reference 


parahaemolyticus 


V733 




Centre,Quebec,Quebec 




VP1 








φ60








φHAWI-5








φPEL8C-1






Vibrio sp. 


a3a 




Felix d'Herelle Reference 






Centre,Quebec,Quebec 




NN 


11985-B1 


The American Type Culture Collection 




phi 


51582-B1 






Phi149




J.Virol.1987.61:3999-4006


Veillonella rodentium 


N2 




Antonie V.Leeuwenhoek.1989.56:263-271
Yersinia enterocolitica


1 

2 
3 
4 
5 
6 
7 
8 
9 

φYeO3-12




Felix d'Herelle Reference 
Centre,Quebec,Quebec 




I, IV & VIII




Zentralbl.Bakteriol.Mikrobiol.Hyg.1982.253:102


Yersinia pestis


R 
S 
Y 


23208-B1 
11593-B1 
23053-B1


The American Type Culture Collection 




II 




Zh.Mikrobiol.Epidemiol.Immunobiol.1990.11
Q 


Yersinia 

pseudotuberculosis 


PST** 


23207-B1 


The American Type Culture Collection 


Yersinia sp. 


RD2 




Mol.Gen.Mikrobiol.Virusol.1990.8:18-21



Table 2

bacteriophage 77, complete genome sequence, 41708 nucleotides

1 gatcaaaata cttggggaac ggttagggag taaacttcgc gataatttta aaaattcatg 

61 tataaccccc ctcttataac cattttaagg caggtgatga aatggagatt atagtcgatg 

121 aaaatttagt gcttaaagaa aaagaaaggc tacaagtatt atataaagac atacctagca 

181 ataaattaaa agtagttgat ggtttaatta ttcaagca^c aaggctacgt gtaatgcttg 

241 attacatgtg ggaagacata aaagaaaaag gtgattatga tttatttact caatctgaaa 

301 aggcgccacc atatgaaagg gaaagaccag tagccaaact atttaatgct agagatgctg

361 catatcaaaa aataatcaaa caattatcgg atttattgcc cgaagagaaa gaagacacag
421 aaacgccatc tgatgattac ctatgattag taataaatac gttgatgaat atataaattt 

481 gtggaaacaa ggaaagataa ttttaaataa agaaagaatt gatctcttta attatctaca
541 aaaacatata tattcacgag atgatgtata ttttgatgaa cagaaaatcg aggattgtat 
601 caaatttatt gaaaaatggt attttccaac attaccattt caaaggttta tcatagctaa 
661 tatatttctt atagataaaa atacagatga agctttcttt acagaatttg ctattttcat 
721 gggacgtgga ggcgggaaaa acggtctaat aagtgctatt agtgattttc tttctacgcc 
781 cttacacgga gttaaagaat atcacatctc cattgttgct aatagtgaag atcaagcaaa 
841 aacatcgttt gatgaaatca gaaccgtttt aatggataac aaacgaaata agacgggtaa 
901 aacgccaaaa gctccttatg aagttagtaa agcaaaaata ataaaccgtg caactaaatc 
961 ggttattcga tataacacat caaacacaaa aaccaaagac ggtggacgtg aggggtgtgt 
1021 tatttttgat gaaattcatt atttctttgg tcctgaaatg gtaaacgtca aacgtggtgg 
1081 attaggtaaa aagaaaaata gaagaacgtt ttatataagt actgatggtt ttgttagaga 
1141 gggttatatc gatgcaatga agcacaaaat tgcaagtgta ttaagtggca aggttaaaaa 
1201 tagtagattg tttgcttttt attgtaagtc agacgatcca aaagaagttg atgacagaca 
1261 gacgtgggaa aaggcgaacc caatgttaca taaaccgtta tcagaatacg ctaaaacact 
1321 gctaagcacg attgaagaag aatataacga tttaccattc aaccgttcaa ataagcccga 
1381 attcatgact aagcgaatga atttgcctga agttgacctt gaaaaagtaa tagcaccatg 
1441 gaaagaaata ctagcgacta atagagagat accaaattta gataatcaaa tgtgtattgg 
1501 tggtttagac tttgcaaaca ttcgagattt tgcaagtgta gggctattat tccgaaaaaa 
1561 cgatgattac atttggttag gacattcgtt tgtaagacaa gggtttttgg atgatgtcaa 
1621 attagaacct cctattaaag aatgggaaaa aatgggatta ttgaccattg tcgatgatga 
1681 tgtcattgaa attgaatata tagttgattg gtttttaaag gctagagaaa aatatgggct 
1741 tgaaaaagtc atagctgata attatagaac tgatattgta agacgtgcgt ttgaggatgc 
1801 tggcataaaa cttgaagtac ttagaaatcc aaaagcaata catggattac ttgcaccacg 
1861 tatcgataca atgtttgcga aacataacgt aatatatgga gacaatcctt tgatgcgttg 
1921 gtttactaat aatgttgctg taaaaatcaa gccggatgga aataaagagt atatcaaaaa 
1981 agatgaagtc agacgtaaaa cggatggatt catggctttt gttcacgcat tatatagagc 
2041 agacgatata gtagacaaag acatgtctaa agcgcttgat gcattaatga gtatagattt 
2101 ctaatagagg aggtgagaca tgagtattct agaaaagata tttaaaacta ggaaagatat 
2161 aacatatatg cttgatttag atatgataga agatctatca caacaagcgt atgtgaaacg 
2221 tttagcgatt gatagttgta ttgaatttgt tgcgcgagct gtcgctcaaa gtcattttaa 
2281 agtattggaa ggtaatagaa ttcaaaagaa tgatgtttac tacaagttaa atataaaacc 
2341 aaatactgac ttatcaagcg atagtttttg gcaacaagtt atatataaac taatttatga 
2401 taacgaggtt ttaatcgtag taagtgacag caaagaatta cttatcgcag atagctttta 
2461 cagagaagag tacgctttgt atgatgatat attcaaagat gtaacggtta aagattatac 
2521 ttatcaacgt actttcacaa tgcaagaggt catatattta aagtacaaca acaataaagt
2581 gacacacttt gtagaaagtc tattcgaaga ttacgggaaa atattcggaa gaatgatagg 
2641 tgcacaatta aaaaactatc aaataagagg gattttgaaa tctgcctcta gcgcatatga 
2701 cgaaaagaat atagaaaaat tacaagcgtt cacaaataaa ttattcaata cttttaataa 
2761 aaatcaacta gcaatcgcgc ctttgataga aggttttgat tatgaggaat tatctaatgg 
2821 tggtaagaat agtaacatgc ctttttctga attgagtgag ctaatgagag atgcaataaa 
2881 aaatgttgcg ttgatgattg gtatacctcc aggtttgatt tacggagaaa cagctgattt 
2941 ggaaaaaaac acgcttgtat ttgagaagtt ctgtttaaca cctttattaa aaaagattca 
3001 gaacgaatta aacgcgaaac tcataacaca aagcatgtat ttgaaagata caagaataga 
3061 aattgtcggt gtgaataaaa aagacccact tcaatatgct gaagcaattg acaaacttgt 
3121 aagttctggt tcatttacaa ggaatgaggt gcggattatg ttaggtgaag aaccatcaga 
3181 caatcctgaa ttagacgaat acctgattac taaaaactac gaaaaagcta acagtggtga 
3241 aaatgatgaa aaagaaaaag atgaaaacac tttgaaaggt ggtgatgaag atgaaagcgg 
3301 agattaaagg cgtcatcgtt tccaacgaag ataaatgggt ttacgaaatg cttggtatgg 
3361 attcgacttg tcctaaagat gttttaacac aactagaatt tagtgatgaa gatgttgata 
3421 ttataattaa ctcaaatggt ggtaacctag tagctggtag tgaaatatat acacatttaa 
3481 gagctcataa aggcaaagtg aatgttcgta tcacagcaat agcagcaagt gcggcatcgc 
3541 ttatcgcaat ggctggtgac cacatcgaaa tgagtccggt tgctagaatg atgattcaca 
3601 atccttcaag tattgcgcaa ggagaagtga aagatctaaa tcatgctgca gaaacattag

3661 aacatgttgg ccaaataatg gctgaggcat atgcggttag agctggtaaa aacaaacaag

3721 aacttataga aatgatggct aaggaaacgt ggctaaatgc tgatgaagcc attgaacaag

3781 gttttgcgga tagtaaaatg tttgaaaacg acaatatgca aattgtagca agcgatacac 

3841 aagtgttatc gaaagatgta ttaaatcgtg taacagcttt ggtaagtaaa acgccagagg 

3901 ttaacattga tattgacgca atagcaaata aagtaattga aaaaataaat atgaaagaaa 

3961 aggaatcaga aatcgatgtt gcagatagta aattatcagc aaatggattt tcaagattcc

4021 Uttttaata caaaaatagg aggtcataaa atgactataa atttatcgga aacattcgca 

4081 aatgcgaaaa acgaatttat taatgcagta aacaacggtg aaccgcaaga aagacaaaat 

4141 gaattgtacg gtgacatgat taaccaacta tttgaagaaa ctaaattaca agcaaaagca 

4201 gaagctgaaa gagtttctag tttacctaaa tcagcacaaa ctttgagtgc aaaccaaaga 

4261 aatttcttta tggatatcaa taagagtgtt ggatataaag aagaaaaact tttaccagaa 

4321 gaaacaattg atagaatctt cgaagattta acaacgaatc atccattatt agctgactta 

4381 ggtattaaaa atgctggttt gcgtttgaag ttcttaaaat ccgaaacttc tggcgtggct 

4441 gtttggggta aaatctatgg tgaaattaaa ggtcaattag atgctgcgtt cagtgaagaa 

4501 acagcaattc aaaataaatt gacagcgttt gttgttttac caaaagattt aaatgatttt 

4561 ggtcctgcgt ggattgaaag atttgttcgt gttcaaatcg aagaagcatt tgcagtggcg 

4621 cttgaaactg cgttcttaaa aggtactggt aaagaccaac cgattggctt aaaccgtcaa 

4681 gtacaaaaag gtgtatcggt aactgatggt gcttatccag agaaagaaga acaaggtacg 

4741 cttacatttg ctaatccgcg cgctacggtt aatgaattga cgcaagtgtc taaataccac 

4801 tcaactaacg agaaaggtaa atcagtagcg gttaaaggta atgtaacaat ggttgttaat 

4861 ccgtccgatg cttttgaggt tcaagcacag tatacacatt taaatgcaaa tggcgtatat 

4921 gttactgctt taccatttaa tttgaatgtt attgagtcta cagttcaaga agcaggtaag 

4981 gttttaacgt acgttaaagg tctatatgat ggttatttag ctggtggtat taatgttcag 

5041 aaatttaaag aaacacttgc gttagatgat atggatttat acactgcaaa acaatttgct 

5101 tacggcaaag cgaaagataa taaagttgct gctgtttgga aattagattt aaaaggacat 

5161 aaaccagctt tagaagatac cgaagaaaca ctataaaatt ttatgaggtg ataaaatggt 

5221 gaaatttaaa gttgttagag aatttaaaga catagagcac aatcaacaca agtacaaagt 

5281 aggggagttg tatccagctg aagggtataa caatcctcgt gttgaattgt tgacaaatca 

5341 aatcaaaaat aagtacgaca aagtttatat cgtaccttta gataagctga caaaacaaga 

5401 attattagaa ctatgcgaat cattacaaaa aaaagcgtct agttcaatgg ttaaaagtga 

5461 aatcatcgac ttattgaatg gtgaagacaa tgacgattga tgatttgctt gtcaaattta 

5521 aatcacttga aaagattgac cataattcag aggatgagta cttaaagcag ttgttaaaaa 

5581 tgtcgtacga gcgtataaaa aatcagtgcg gagtttttga attagagaat ttaataggtc 

5641 aagaattgat acttatacgc gctagatatg cttatcaaga tttattagaa cacttcaacg 

5701 acaattacag acctgaaata atagattttt cgttatctct aatggaggta tcagaagatg 

5761 aagaaagtgt ttaagaaacc tagaattaca actaaacgtt taaatacgcg tgttcatttt 

5821 tataagtata ctgaaaataa tggtccagaa gctggagaaa aagaagaaaa attattatat 

5881 agctgttggg cgagtattga tggtgtctgg ttacgtgaat tagaacaagc tatctcaaac 

5941 ggaacgcaaa atgacattaa attgtatatt cgtgatccgc aaggtgatta tttacccagt 

6001 gaagaacatt atcttgaaat tgaatcaaga tatttcaaaa atcgtttgaa tataaagcaa 

6061 gtatcaccag atttggataa taaagacttt attatgattc gcggaggata tagttcatga 

6121 gtgtgaaagt gacaggtgat aaagcattag aaagagaatt agaaaaacat tttggcataa 

6181 aagagatggt aaaagttcaa gataaggcgt taatagctgg tgctaaggta attgttgaag 

6241 aaataaaaaa acaactcaaa ccttcagaag actcaggagc actgattagt gagattggtc 

6301 gtactgaacc tgaatggata aaggggaaac gtactgttac aattaggtgg cgtgggcctt 

6361 ttgaacgatt tagaatagta catttaattg aaaatggtca tgttgagaaa aagtcaggaa 

6421 aatttgtaaa acctaaagct atgggtggga ttaatagagc aataagacaa gggcaaaata 

6481 agtattttga gacgctaaaa agggagttga aaaaattgtg attgatattt tgtacaaagt 
6541 tcatgaagtg attagtcaag acagaattat tagagagcac gtaaatatca ataatattaa 

6601 gttcaataaa taccctaatg taaaagatac tgatgtacct tttattgtta ttgacgatat 

6661 cgacgaccca atacctacaa cttatactga cggagatgag tgtgcatata gttatattgt 
6721 ccaaatagat gtttttgtta agtacaatga tgaatataat gcgagaatca taagaaataa 

6781 gatatctaat cgcattcaaa agttattatg gtctgaacta aaaatgggaa atgtttcaaa 
6841 tggaaaaccg gaatatatag aagaatttaa aacatataga agctctcgcg tttacgaggg 
6901 cattttttat aaggaggaaa attaaatggc agtaaaacat gcaagtgcgc caaaggcgta 
6961 tattaacatt actggtttag gtttcgctaa attaacgaaa gaaggcgcgg aattaaaata 
7021 tagtgatatt acaaaaacaa gaggattaca aaaaattggt gttgaaactg gtggagaact 
7081 aaaaacagct tatgctgatg gcggtccaat tgaatcaggg aatacagacg gagaaggtaa 
7141 aatctcatta caaatgcatg cgttccctaa agagattcgc aaaattgttt ttaatgaaga 
7201 ttatgatgaa gatggcgttt acgaagagaa acaaggtaaa caaaacaatt acgtagctgt 
7261 atggttcaga caagagcgta aagacggtac atttagaaca gttttattac ctaaagttat 
7321 gtttacaaat cctaaaatcg atggagaaac ggctgagaaa gattgggatt tctcaagtga 
7381 agaggttgaa ggtgaggcac ttttcccttt agttgataat aaaaagtcag tacgtaagta 
7441 tatctttgat tcagctaaca tgacaaatca tgatggagac ggtgaaaaag gcgaagaggc
7501 tttcttaaag aaaattttag gcgaagaata tactggaaac gtgacagagg gtaacgaaga 
7561 aactttgtaa caaaaccggc ttcatcggaa actgcggtaa agtcggttaa tataccagat 
7621 agcattaaaa cacttaaagt tggcgacaca tacgatttaa atgttgtagt agagccatct 
7681

7741 

7801 

7861 

7921 

7981 

8041 

8101 

8161 

8221 

8281 

8341

8401 

8461 

8521 

8581 

8641 

8701 

8761 

8821 

8881 

8941 

9001 

9061 

9121 

9181 

9241 

9301 

9361 

9421 

9481 

9541 

9601 

9661 

9721 

9781 

9841 

9901 

9961 

10021 

10081 

10141 

10201 

10261 

10321 

10381 

10441 

10501 

10561 

10621 

10681 

10741 

10801 

10861 

10921 

10981 

11041 

11101 

11161 

11221 

11281 

11341 

11401 

11461 

11521 

11581 

11641 

11701 



aatcaaagta 

gatggtcaag 

atgagtgaca 

ttgaaaataa 

agaagatcca 

atttgaaatt 

gatgaagcca 

ccaattcaca 

tcgtgaacaa 

ccagaacatg 

actctcatga 

tttcattatg 

gaggctttaa 

ttttagaaag 

tagatgcagc 

attctgactt 

acaaacaaag 

atttagccaa 

aaaagttacg 

tacaaaaaac 

tggcagaaag 

caaaaatggg 

ctgttttagg 

atactgttac 

ttaaagatgt 

aagttaatac 

tgaaattcag 

caatgggcga 

aagcggcgca 

gcgctccaat 

gggaaaagtc 

attggggtaa 

aaaagacgcc 

caggtcctga 

aaactattga 

ccgaaagatt 

ctattgaaag 

ttgattggtt 

ttgctgctgc 

atgcagtaac 

gttttttatc 

ttggcattgt 

aatctgaaac 

gtaattttat 

cgatatcagc 

atgaaaacgg 

tatttgaatt 

tgcaatttat 

gtgtaataca 

tcgttggtga 

aattaatttg 

actttggcgg 

tcagtaaatc 

atagcgtaaa 

tccgtacgaa 

ctaatttatg 

tttggaattc 

gtggaatttt 

atatcggcgg 

actgggtcgg 

acacacatac 

cagttgggga 

tccctaacgg 

gctcaaaagt 

ttagtttagg 

caaaagataa 

attttatgga 

tcaattcttt 



agttattgaa 

ttactgcgga 

ctataacaat 

ggagagtatt 

aaagcaaatg 

gtatacgaag 

agagaaatcg 

gttaaagacc 

gtgattttca 

aaataaagcc 

tggacttaat 

tgctttccat 

ttgatgcatt 

gaggtaaaaa 

aaatttaaat 

aaaattaaca 

gattaaagaa 

gcaatatgac 

acaagaatat 

atcagccgaa 

tggctgggga 

tgatggttta 

tattgcagca 

tcaagcaaca 

ttatggcaat 

aaggttaggt 

tcatataaca 

tgcaggtatc 

agctagtggg 

gagagctatg 

aggcgttaat 

agctggtaaa 

ggatatagct 

tttagcagac 

agattcccaa 

taaagtagca 

tgcgtttgct 

ttccaattta 

aattggtcct 

tgtattagct 

gactaaagta 

attaggtgta 

atttagaaat 

tcaatttatt 

aatagttgat 

aatttccatt 

tattttaaat 

ttggccggcg 

aggtgcttta 

ttggcgagga 

gaatttagtt 

gttgctaaaa 

tttatcagca 

atcaattttc 

tacaatagga 

gaatgcgacg 

cattaaagat 

cacaaatatg 

tatggtaagc 

tggtaagttg 

tactacaaga 

taagggacgc 

taaacgtgta 

atacaacggt 

tactatgtgg 

aataggtaaa 

aaatccaggc 

aactaaaggt 



atacacaaca 

agcacaaggc 

aaatgtagaa 

ataaaatggc 

aaattaaatt 

caatggattt 

ctgacagatt 

taaaagaacg 

ttactcaagg 

tgaagattta 

tgaaaatggt 

atatcaaaat 

ttaaccttaa 

atgggagaaa 

agatcatttg 

ggcaacaact 

cttgatggaa 

aaggtatctc 

aacaaacaag 

tttgaagagt 

aaaaccagta 

aaatccattg 

gcatcaggaa 

ggcgcaacag 

tttccagcag 

tttacaggta 

ggttctgacg 

gaagcaagtg 

ataagtgttg 

ggctttgaga 

actgaaatag 

aacccaagag 

agcgcaacaa 

gctattaaag 

ggcacagtaa 

atgaataaat 

cccgtaatgg 

agtgatggtt 

gtagtttttg 

ccattgttag 

cctatattag 

ttggctggtt 

tttgttaatg 

caacctttcg 

ttcgcaaaag 

gttcaagcac 

tttgtaatta 

gttaaagcct 

aatatcatac 

gtttgggacg 

caattatggt 

ggattgatag 

atttggaatg 

acaaatatga 

aaagcgcagt 

aaagaaattt 

aatacggtag 

cgcgatggct 

gccattaaaa 

ggaatggata 

ttagttaaga 

ggaaatggtc 

atcacaccta 

gcacaaactt 

aaagatatta 

ggtaccaaat 

aaacttttaa 

atgggaattg 



gatcaaacga 

attgctacgg 

gcataagagg 

aaaattaaaa 

acaaacgtac 

aatcgatgat 

gatggatatg 

tatgcatgca 

tcaacaaact 

acatataaag 

aaagacgcta 

aaaaataatg 

ccgtttggtt 

gaataaaagg 

cagaaatcaa 

tcaaatatac 

ccatcacagg 

aagaacaggg 

caaatgagct 

tcaaaaaagc 

aagtttttga 

gtaaaggttt 

aagcttttgc 

gcagtgaatt 

atgctgaaac 

aagaacttga 

gtgtgcaagc 

aacatcaaag 

atacattagc 

tgaaagaatc 

cattcagtgg 

aagaatttaa 

gtttagcgat 

gtggtcgctt 

accaaacatt 

caaaattagt 

aagaattaat 

ctaaaagatc 

ggttaggtgc 

ctagtattgc 

gaactgtctt 

tagcagtcgc 

gtgcaattga 

ttgattctgt 

atatttggag 

ttcaaaatat 

aaccaattat 

tgattgtcag 

ttggcttgat 

ccgttgtgat 

ttgtaggtaa 

caggaatttg 

caacaaaaag 

aaaattggtt 

cattatttag 

ttagtaattt 

gaattgcaag 

tgagttccat 

aaggacttaa 

aaatacctaa 

acggtaagat 

caaatggttt 

atacagatac 

attcaatgtt 

aatctggtgc 

ggcttggcga 

attatatact 

caggcgacat 



atattgtatc 

ttaaagcaac 

gggcaacccc 

cgtaacatta 

ttaacaccac 

attgaggacg 

gtcgtaaaaa 

cctgatggaa 

gaggaaacta 

caatgttgaa 

acgaagtttt 

acatttctga 

agggttattt 

tttatctata 

acgaaacttt 

cgaaaaatca 

ttataagaaa 

cgaaaacagt 

gaattattta 

tcaagttgaa 

aagtatggga 

gatgattggt 

agaagttgat 

aaaaaaattg 

tgttggtgga 

aaatgccaca 

cgtacagtta 

tgttttggat 

tgatagtatt 

aattgcttta 

tttgaaaaaa 

gaagacatta 

tgaagcattt 

tagttatcaa 

taaagattct 

aggtgctgat 

caaaaagcta 

aattgttatt 

atttataagt 

aaaggctggt 

cacagcttta 

atttacaatt 

aagtgttaaa 

taaaaacatc 

tcaaatcaat 

atgcaacttt 

gttcgcgatt 

tacttgggag 

taagttcttc 

gattcttaaa 

aatacttggt 

ggacgtaata 

tatttttgga 

atctaatact 

tggcgtcaaa 

aagaaatcgg 

ccgtttatgg 

catagataag 

taaattaatc 

gttacacact 

tgcacgtgac 

tagaaatgaa 

taccgcttat 

aaacggaacg 

atcatcggca 

taaagttggc 

tgaagctttt 

aacaaaagct 



aatcaatagt 

agttggtaat 

tctattttat 

ctcaattagt 

acttcatttc 

aaaatagcac 

tttacgataa 

tgaatgcact 

gaaattttat 

aaatatggat 

aaaaatgcca 

agaaaaagca 

ttttgaactt 

ggtttggatt 

aaaactttaa 

actgatagtt 

aacgttgatg 

gcagaagctc 

gaaagagaat 

gctcaaagaa 

cctaaattaa 

gtaactgcac 

aaaggtttag 

cagaactcat 

gttttaggag 

gagtcattct 

attacccgtg 

atggtagcaa 

actaaatacg 

ttctctcaat 

gctatatcaa 

gcagaaattg 

ggtgcaaagg 

gaatttttaa 

gaaagtggct 

gtatgggctt 

tctatagcgg 

ttcagtggta 

acaattggca 

ggattgatta 

actggtccaa 

gcttataaga 

caaacattta 

tttaaacaag 

ggattcttta 

attaaagcga 

tggcaagtga 

aacataaaag 

tcaagtttat 

ggagcagttc 

gttgttaggt 

agaagtatat 

tttttattta 

cggagcagta 

tcaaaattta 

atgtcaaata 

agtaaggtac 

attaaaagtc 

gacggtttaa 

ggtacagagc 

acattcgcta 

atgattgaat 

ttacctaaag 

cttccaagat 

tttaactgga 

gatgttttag 

ggaattgatt 

gcatggtcta 
11761 
11821 
11881 
11941 
12001 
12061 
12121 
12181 
12241 
12301 
12361 
12421 
12481 
12541 
12601 
12661 
12721 
12781 
12841 
12901 
12961 
13021 
13081 
13141 
13201 
13261 
13321 
13381 
13441 
13501 
13561 
13621 
13681 
13741 
13801 
13861 
13921 
13981 
14041 
14101 
14161 
14221 
14281 
14341 
14401 
14461 
14521 
14581 
14641 
14701 
14761 
14821 
14881 
14941 
15001 
15061 
15121 
15181 
15241 
15301 
15361 
15421 
15481 
15541 
15601 
15661 
15721 
15781 



agattaagaa 
atttagtcgg 
cttataccgc 
aagaagttag 
atggtaatta 
actttagcaa 
ctggtaatac 
gacattttga 
gtggtggcgg 
cgcaaagtat 
ttgcaaaacg 
aaagaggaga 
ctaaacgtgg 
acattgttag 
caggtggaaa 
ttattccaac 
cagaagtaag 
acgggtttga 
ctttattact 
ttattgacga 
aagaatcaac 
taaagtgaac 
ttttaattat 
gcgtaggctt 
tcacaacggc 
cgaggaacaa 
aggaccaata 
actaacagac 
agtttcagtt 
taaaccatct 
tgatgaggta 
tgatttcaaa 
taaggtcggc 
tcctgatgca 
agattttcaa 
agcacaacat 
atatcatgat 
caaaaagata 
ttatatgcgg 
cattaaagac 
cggtaagttt 
ttataagtgg 
gaaaggcgca 
aaaaagtgtt 
tttcaatgtt 
gacggttaaa 
agattttaac 
gattcataaa 
aagagctgaa 
gcgtgaattt 
tatagcgtct 
gaaaaagaca 
tgaacaaacc 
tgaagtttta 
tagctctaat 
aggtaaagaa 
agaaatcaaa 
gctagttgtg 
ggggatatat 
agccaaaaca 
tgatttggaa 
acatagagat 
cataatttca 
attacgagaa 
tagcaatatc 
caaaatacac 
tacaagtaac 
aacaccaaat 



aagtgctact 
cggaatatta 
tgcaactgga 
aacgccgatg 
tgtaaaaatt 
atcaccacct 
cggatttagt 
ccctgaacca 
tgctacttct 
tttaggtggt 
tgaaagtaac 
cccatcaaga 
atatactaac 
acgatatggt 
agtttttgat 
agatccagct 
agggaaaaaa 
tgatcctagc 
gaaaatagca 
atacgctttt 
aaaagtaaag 
aacaaaacaa 
gttttaaaaa 
gaatcttata 
attaaaacac 
gttaaattac 
aagctgcaca 
ccttacaaat 
gtaaatagtg 
agttacttta 
accaaagaag 
ggttggacta 
ggtgactttg 
aaaggttggg 
attacctata 
atttatgata 
agaaaaatag 
tacgactatc 
ctcagaagag 
ccagatagac 
tatcagcgtc 
atggagatga 
agggatgtca 
gtcatcaatg 
gattctgggt 
tggcaagata 
gacaagatta 
cgtaatgtta 
aagttccgtg 
attattaact 
tatcttgctg 
acttcagaag 
gaatacgatg 
aagcaattat 
accgtcaaag 
attgaatatg 
acagcattaa 
acagatgacg 
gaaccacaat 
gagttaaata 
gttacgtatc 
tttaacccgc 
gaaaatagca 
gagtttaaca 
aacactatag 
aaaagtgata 
cctgatgttg 
gatgttgaaa 



gattggataa 
gaccctgaca 
agaccatttc 
ggtggcagac 
actagtggcg 
agtggcacga 
acaggaccac 
tatttaagga 
ggaagtggcg 
cgttataaag 
taccagtcaa 
ggattattcc 
tttaataatc 
tggggtggtt 
ggttggtata 
cgtagaaatg 
gcgagtaaaa 
ttattattga 
caatctaacg 
gataaaaagg 
tttagaaaag 
ttccttggtt 
cagaaaatgt 
gttttgatat 
atgatgacgt 
aattcaaatc 
aagaatttac 
attcagtaac 
ggactgctga 
tgattactaa 
ttaaggatta 
agatgattac 
tgatatccaa 
ttggtgctgg 
aatgtattgt 
gtgatggtaa 
gacatattgt 
agaataaacc 
taggtaataa 
gtaaacctat 
cagcttctat 
atgggttagg 
ttatacaaaa 
aggaaccaat 
acagtgaatt 
gatatttata 
tagatttcct 
atgacaattc 
aacgacatcg 
gggttcaaga 
atataacaac 
cattgaaaga 
gcttacgtac 
gtacaaccta 
gtagatatgt 
gtaaagattt 
ttgctgtggg 
aagcgcaaag 
cagatgatca 
aacgtaagtc 
cgcacgagat 
cattgtatgt 
catatacatt 
agcgattgaa 
ttaaagatgt 
caccgccaga 
ctgtcttgcg 
aattaggtgg 



aagaaaattt 
aaattaatta 
atgaaggtgt 
ttacaagaat 
ctatcgatat 
tggtaaagcc 
atttacattt 
atgctaagaa 
caacttatgc 
gtaaatggat 
atgcagtgaa 
aaatcatcgg 
cagtacatca 
ttaaacgtgc 
acttaggtga 
atgcaatgaa 
ataagcgtcc 
aaatgattga 
atgtgattgc 
tgaacgcgtc 
gaggaattgc 
gtatgtcgaa 
agatggacgt 
acctttggtg 
cttgaatgaa 
taaagattgg 
aatacctgtt 
aggaaataaa 
cactccttta 
aaatgatgaa 
catgcctcct 
tgaagatatt 
tcttggcgaa 
cacgaaacga 
tgaacaaaaa 
gttacttgct 
tgttacgttg 
gataatgtat 
attttctatt 
tgatatggat 
catagctgtc 
ttcattcaat 
aggtgattta 
gttgagcgag 
aatcatacaa 
gaaaggagat 
ttctactgat 
agaaatgctt 
tgttattata 
tacgatggac 
agctaaaccg 
tgtgttgagc 
tacgtcatgg 
taaaatggtt 
agtactcaaa 
agtcgggtta 
acctgaaaat 
tcaattcaac 
aaatatgaat 
ggcagttatg 
tatatcaatt 
agaggcagaa 
cggtcaacct 
cataatacat 
tgtagatggt 
aaatccagtc 
tagatattgg 
tataacaaga 



agaagctatg 
tcattatgga 
cgattttcca 
gccatttatg 
gctatttgcg 
cggtgatgtt 
tgaaatgagg 
aaaacgaaga 
cagtcgagta 
tcatgaccaa 
taactgggat 
ctcaactttt 
aggtatctca 
tggtgattac 
agacggtcat 
gattttgcat 
tagccaatta 
acaacagcaa 
agataaagat 
tatagaaaag 
tattcaatga 
agagggtttg 
tcggggtcta 
gtacgtaatg 
ttagtaaagt 
tactggaacg 
aagttcacta 
aatactgcga 
attgttgaag 
gattatttta 
gtttatcata 
ccaagtaatg 
ggatataaag 
gggctcccta 
ggtaaaggtg 
tctattggtt 
tataaccaaa 
aacttggaca 
aaaacttgga 
gagaaagagt 
tatagtgcga 
acggagattc 
gtaaaaatag 
aaatcgtttg 
cctgaaaacg 
gagagtgtga 
gacccttcct 
gaactgctca 
agggattcaa 
ggctacacag 
tatgcaccag 
gatacaggtt 
acttcttatc 
ttagattttt 
aagaaaaaca 
actaggaaga 
gacaaaggga 
ctacctatgc 
gaaacacgat 
tcatatgaga 
ggcgatacag 
gttattgctg 
aaagagttca 
caaaagttaa 
gaattagaat 
aatgatatgc 
aatggtcgat 
gagaaagcgc 



ggcggtggcg 
cgtaccgcag 
tttgtatatc 
tctggtggtt 
catttgaaaa 
gttggtttaa 
agaaatggac 
ttatcaatag 
atccgacaag 
atgatgcgcg 
ataaatgctc 
agagcaaacg 
gcaatgcagt 
gcatatgcta 
ccagaatgga 
tatgcagcag 
tcagacttaa 
caacaaatag 
tatcagccga 
cgagaaaggc 
tagacactat 
aaataccctc 
tatataaagg 
actatttatc 
tttttaacta 
cttatttcga 
tcaaagtagt 
tttcagacca 
cccgagcaat 
tggttggtga 
gtgagtttcg 
acttaggtgg 
caactaattt 
aagcgatgac 
ccggaagaac 
atgaaaataa 
aaggagaccc 
gaatcgttgt 
aatttgatca 
ggatagatgg 
agtataacgg 
taccgaaacc 
atatgcaagc 
gaagtaatta 
tctttgatac 
tacatgtttt 
tagttagagc 
tatcatcaga 
acaaacaatg 
agatagaatg 
gcaaatttga 
gggaagtttc 
aaactagata 
atattgagct 
gcttattcaa 
ttgatatgtc 
agcgtttaga 
gctatatttg 
taagttcttt 
ttacttctac 
tcagagtaaa 
aagaatataa 
aagaatcaga 
acgataatat 
actttgaacg 
ttcggtatga 
ggattgaagc 
tattcagtga 
15841 attaaacaat atttttatta atttatctat 

15901 agaattaccg aatagcgagt acttagtaga 

15961 tttagacgct gtgatcgatg tttataatca 

16021 cgaaactgca acgattggtc ggttggtaga 

16081 gaaattacaa gatgtttata cagatgtaga 

16141 taaattatta cagtcacaat acactgatga 

16201 aacaaaattt ggtttaacgg tgaatgaaga 

16261 taaatcagct attgaagcag ctagagaatc 

16321 aacatcggac tataaaacag acaaagacgg 

16381 tgagagaacg actttaaaag gtgaaatcaa 

16441 cggattggaa gaacaaaaac aatatactga 

16501 tgagattaaa gcaagtattg aacaagcaaa 

16561 cattgatgct caagatgatc ttaaagagaa 

16621 ttcggaagaa gagcaacgcg ctatacaaga 

16681 aaacgcagaa ctaaaggcta gaaacgctga 

16741 ggtcaaagaa agcacagatg cacagaggaa 

16801 acaaaatggt aaggaaatca aattaagaac 

16861 tacactttca aatatattaa acgagattgt 

16921 atatgatgat aacggagtgg ctcaagcttt 

16981 tgctgataaa attgatatta acggtaatag 

17041 agataaagta gataaaaccg atattgtcaa 

17101 tatcaatgtt aatagaattg gaattaaagg 

17161 gaatgattct attgaactag gtggtattgt 

17221 agacgatatt tttacgcgac tgaaagacgg 

17281 cggttcactt tatatgtcac attttggtat 

17341 cggtggttca tctggtacga ttcaatggtg 

17401 tggtataaca atcaattcct atggtggtgt 

17461 tgttctggag tcttacgctt catcgaatat 

17521 tccaaacaca gacaaagtgc ctggattaaa 

17581 taatgcttat tcgagtgacg gttatattat 

17641 tgcgggtatc aggttttcta aagaaagaaa 

17701 atatgcaaca ggtggagata caacaatcga 

17761 acgacgtgat ggtaataggt atattcatat 

17821 agatgatgca ggagatagga tagcttctaa 

17881 agctaatttg catattactt ctgctggcac 

17941 caagttatct atcgaaaatc aatataacga 

18001 tattcttaac ttacctatta gaacgtggtt 

18061 agagctgaga gaagatagaa aattatcgga 

18121 tttgattgct gaagaggtgg agaatttagg 

18181 aggagaaatt gaaggtatag cgtatgatcg 

18241 agaacaacaa ctaagaatca agaaattgga 

18301 gattacaagc taatcctgaa tatacaattc 

18361 cacaagaaaa cgcgatgtta aaagcgtata 

18421 ctgaggaaga gtaatcctta gcactatttt 

18481 atggcaaaag aaattatcaa caatacagaa 

18541 ggtacagaac gtgtagtata tcaagatttc 

18601 aaccatgctc aagattttaa atctgaagaa 

18661 ttgttatatc aattaactaa caaaaaacaa 

18721 agatcagatt tatctccaga ggtaacagtt 

18781 tagatactca tagtctttat tcttttagaa 

18841 aaacacgaac atgaatggcg catcagaagg 

18901 acactcaacg aaattaaatt aggtcaaaaa 

18961 aaaaccttag atgctattca aaaagaaaga 

19021 gataagaaca tacgtgatat gaaaatgtgg 

19081 tcgctaatta tagcattatt gcgtatgctt 

19141 tcggattaaa ttttggagct tcgctgtgga 

19201 agttaagagt cagtgcttcg gcactggctt 

19261 gatgcaaaag taataacaag atacatcgta 

19321 gcgaacaaag gtattagccc aattccagta 

19381 actgtagtcg ctttatatac aacgtataaa 

19441 gcaaatcaaa aattaaagaa atataaagct 

19501 gcgccaatta aagaagtaat gacacctacg 

19561 gtggttgata tatgctaatg acaaaaaatc 

19621 ggaaacaatt caacccagat ggttggtatg 

19681 tctttatgtt agcgacaggc gaaaggctgc 

19741 ataataaagc aaagattgaa aaatatggtc 

19801 cgcaaaagtt ggatattgtc gttttcccgt 

19861 aaattgttga gagcgcaaat ttaaatactt 



acaacacgct agtcttttgt cagaagctac 
taatgatttg aaagcggact tacaagcaag 
aattaaaaat aatttagaat ctatgacacc 
tacacaagct ttatttcttg agtatagaaa 
agatgtcaaa atcgccattt cagatagatt 
aaaatataaa gaagcgttgg aaataatagc 
tttgcagtta gtcggagaac ctaatgttgt 
cacaaaagaa caattacgtg actatgtaaa 
tattgttgaa cgtttagata ctgctgaagc 
agataaagct acgttaaacg aatatcgaaa 
tgaccagtta agtgatttgt ccaataatcc 
tcaagaagcg caagaagctt taaaatcata 
ggaatcgcaa gcgtatgctg atggtaaaat 
tgctcaagct aaacttgaag aggcaaaaca 
aaagaaagct aatgcttata cagacaacaa 
aacattgact cgctatggtt ctcaaattat 
tactaaagaa gagtttaatg caaccaatcg 
tcaaaatgtt acagatggaa caacaatcag 
gaatgtgggg ccacgtggta ttagattaaa 
agaaataaac cttcttatcc aaaatatgcg 
cagtcttaat ttatcaagag agggtcttga 
cggtgacaat aacagatatg ttcaaataca 
gcaacgtact tggagaggga aacgttcaac 
tcacctaaga tttagaaata acaccgctgg 
ttcgacttat attgatggtg aaggtgaaga 
ggataaaact tacagtgata gtggcatgaa 
cgttgcacta acgtcagata ataatcgggt 
caaaagcaaa caggcaccgg tgtatttata 
ccgatttgca ttcacgctgt ctaatgcaga 
gtttggttct gatgagaact atgattacgg 
taaaggtctt gttcaaattg ttaatggacg 
agcagggtat ggcaaattta atatgctgaa 
acagagtaca gacctactgt ctgtaggttc 
ctcaatttac agacgtactt attcggccgc 
aattgggcgt tcgacatcag cgcgtaaata 
tagagatgaa caactggaac attcaaaagc 
tgataaagct gagtctgaaa ttttagctag 
agacacctat aaacttgata gatacgtagg 
attaaaagag tttgtcacgt atgatgacaa 
tctatggatt catcttatcc ctgttatcaa 
ggagtcaaag aatgcaggat aacaaacaag 
attatttatc acaggaaatt atgaggttaa 
tacaagaaaa taaagaaaat caacaatgtg 
tacacaaaaa tttaaggagg tcatttaatt 
aggtttattt tagtacaaat cgacaaagaa 
acaggaagtt ttacaacttc tgaaatggtt 
aacgctaaga aaattgcgga gacgttaaat 
cgtgtgaaag tagttaaaga agtagttgaa 
aacactgaaa cagtatgaaa agctatgagt 
agcgggtgta ctgaattggg gtggttcaaa 
ttagaagaga atgataaaac aatgctcagc 
acccaagagc aagttaacat caaattagat 
gaaatagatg aaaagaataa gaaagaaaat 
gtgcttggtt tagttgggac aatatttggg 
atgggcatat aagagaggtg attaccatgt 
cgtgtttctg gtttggtaag tgtaagtaat 
tttattttgg ataaaaggag caaacaaatg 
ttgatcttag cattagtaaa tcaattctta 
gacgatgaaa ctatatcatc aataatactt 
gacaatccaa cacctcaaga aggtaaatgg 
gaaaataagt atagaaaagc aacagggcaa 
aatatgaacg acacaaatga tttagggtag 
aagcagaaaa atggtttgac aattcattag 
gatttcagtg ttatgattac gccaatatgt 
aaggtttata tgcttataat atcccgtttg 
aaataattaa aaactatgac agctttttac 
caaagtatgg tggcggagcc ggacacgttg 
tcacatcatt tggtcaaaac tggaacggta 
19921 aaggttggac taatggcgtt gcgcaacctg 

19981 ttcattatta tgacaatcca atgtatttta 

20041 ttggcaataa agctaaaggt attattaagc 

20101 aacctaaaaa aattatgctt gtagccggtc 

20161 acggaacaaa cgaacgcgat tttatacgta 

20221 taagacatgc aggacatgaa gttgcattat 

20281 atcaagatac tgcatacggt gttaatgtag 

20341 ttaaatcaca ggggtatgac attgttctag 

20401 caagtggtgg gcatgttatt atctcaagtc 

20461 tacaagatgt tattaaaaat aacttaggac 

20521 tactaaatgt taatgtatca gcagaaataa 

20581 ttattactaa taaaaatgat atggattgga 

20641 taatagccgg tgcgattcat ggtaagccta 

20701 catcagctaa aaacaaaaaa aatccaccag 

20761 atgtccctta taaaaaagaa caaggcaatt 

20821 taagagacgg ttattcaact aattcaagaa 

20881 ttacgtatga cggtgcatat tgtattaatg 

20941 gtggacaacg tcgttatata gcgacaggag 

21001 gttttggtaa gtttagcacg atttagtatt 

21061 tatagggaat cttacagtta ttaaataact 

21121 tttttaacat ttctctcaag atttaaatgt 

21181 tattttttta tgttatagct agccttcggg 

21241 catcaactat ttacatctat ccttgttcac 

21301 gatagagagc atagttttca tactactccc 

21361 taacagttta cggggtgctt ttatgttata 

21421 tagccgggca gaggccatgt atctgactgt 

21481 cactcgatac atatatctta acaacataga 

21541 tcgatacggt tatatttatt cccctacaac 

21601 attgtggtta ttttttgcgt ttttttgggg 

21661 caaacgcttg tggaaaagct aaaaggttaa 

21721 tttggacgct cgtgtacgtt agagaatgac 

21781 cttgtgttaa aaagccttta atatcagttg 

21841 aaaaaagggc agaaaaaggg cagatacctt 

21901 taactctctg tccattttct ctgttacatg 

21961 tgtatgtcct actcttttca taattgcttt 

22021 tatgtgtgta tgccttagtg tgtgagtagt 

22081 agctgaggac aatcgtttgt ttatcctact 

22141 gaatataaac cctctatcaa catagcttgg 

22201 cattattttt ttcaatacat ttgctatcct 

22261 tgcggtctta gtagtatctt tgtgaccaaa 

22321 gccattaata gcgatcgttt tatttttgag 

22381 ctcacctatg cgcatacctg ttaaagcttg 

22441 agctctatac tgcatgttat tatcgttcag 

22501 catctctaaa tagttataca ttttcgcttc 

22561 cttctttggt agtgtgacgc tatttaatat 

22621 ggcgtattta atagcttctt tcatatgtcc 

22681 tacgtttgat aatttgttaa taaatgtttg 

22741 taaattttga gaactgttct ttttgatgtt 

22801 cgttacttta aagccagatg tttttatatg 

22861 aaaagtcaaa gtttttaatt cgcttgacga 

22921 ttctaaacga aacattgcct ctttttgcga 

22981 tacacgtttc catttatctg tatacggatc 

23041 ttcattgttc ttatttttaa atttttcaaa 

23101 aaaaaataat aagggtaggc gggctaccca 

23161 aaaatacaga cgccacttat aattataaga 

23221 aatatatacg tgttttaaag gataaacctt 

23281 agggatctgc aatatattat tattaattct 

23341 tattactgga tttttaattt tttggggtaa 

23401 ctggaaagaa tttatgcaag cgtaactatt 

23461 tgatactatg ttattaatgt ttctgtcaat 

23521 atcagatata aattcaataa aataatcttt 

23581 tttttcatcg aaaacttctt ttaatatagc 

23641 aaacaatctc aaataatact cccatttcaa 

23701 ttctttagag gataagggaa taacatttac 

23761 catcactatt gcaaagtgtg aattagaaaa 

23821 aaaaactatt tctccttgtt taaactttgg 

23881 aaatctcttg agtaaatagt gaatatctga 

23941 agtttttaat ttattaatgc gtttttctat 



gttggggtcc tgaaactgtg acaagacatg 
ttaggttaaa cttccctaac aacttaagcg 
aagcgactac aaaaaaagag gcagtaatta 
atggttataa cgatcctgga gcagtaggaa 
aatatataac gcctaatatc gctaagtatt 
acggtggctc aagtcaatca caagatatgt 
gcaataaaaa agattatggc ttatattggg 
aaatacattt agacgcagca ggagaaagcg 
aattcaatgc agatactatt gataaaagta 
aaataagagg tgtgacacct cgtaatgatt 
atataaatta tcgtttatct gaattaggtt 
ttaagaaaaa ctatgacttg tattctaaat 
taggtggttt ggtagctggt aatgttaaaa 
tgccagcagg ttatacactc gataagaata 
acacagtagc taatgttaaa ggtaataatg 
ttacaggggt attacccaac aacacaacaa 
gttatagatg gattacttat attgctaata 
aggtagacaa ggcaggtaat agaataagta 
tacttagaat aaaaattttg ctacattaat 
atttggatgg atgttaatat tcctatacac 
agataacagg caggtacttc ggtacttgcc 
ctagtttttt gttatgatgt gttacacatg 
ccaagcatgt cactggatgt tttttcttgc 
cgtagtatat atgactttag cattcccgta 
attgctttta tatagtagga gtgaactata 
tggtcccaca ggagacatct tccttgtcat 
aatgttacat tcgctataac cgtatcttaa 
caacaaaacc acagatccta ttaatttagg 
caaaaaaagg gcagattatt tgaaaaaggg 
aaatgacaaa aaccttgata caacagtgtt 
cggtttacca tcatacaagg gtgggattaa 
ttacaaagga tttgtagcgt ctttaaaaat 
ttagtacaca agtttttcta atttttgctc 
tgtatacacc tttatagtcg ttttttcatc 
taacgatata ttcatttccg ccaataaact 
aactttttta tttatattta atgattctgc 
gccttgcata ggatttcctt ggcaagttgt 
ttcccattgt tgcatctttt tattctctaa 
tgaattgatg gcgatttttc ttcttgaacc 
tccagcatta catttgattc tgtgaatagt 
gtcaacatct ttaacttgga gagctaataa 
aacttctaca gccccagcaa ctaaaatacg 
tataaaatcg cgtatctgta ttacctgttc 
ttctttttct atatcttcta tcgtcttact 
gtgttcgttt ggataattgt aaaatttaac 
aagttgacgc tttacctgat ttgcagaata 
catgtacttt gtatcaattt tgtttaaaag 
tttgattctt gttttcaaat tatcaagcgt 
atattcaagc cattcatcta ataacgcgtg 
cttgttgttt agtttttctt ttattttttc 
ttgctttgta ctcttattca agacaacact 
tttgtatttc tcgtagtatc tatacttcgt 
ccacatttta catccctcct caaaattggc 
tgaaaattgt ataaaaaaag acgcctgtat 
ttacatggtt aattaccaaa aatggtaacg 
taatatatta aaattatatc atcttatatc 
atttatcagt aacataatat ccgaagaatc 
aacttttctt atgcgaaact tactaatcgg 
accttttaat ttttttacct tatcaattgc 
tttatttaat ttatcttcaa cttctaaact 
agtgatgaat tccgtgttgt ttttttggta 
tgaattattt tgcgcgctaa tcaaatttaa 
atcaaaattc atctttaaat actttttgtt 
tatatcctcc gtaccagaat catttttatt 
ttctttatta acgtttatac cgaaatctac 
ataaaaacct ttatggtttt tctcaccctc 
atctaacttt ttaaattttg gatttccaga 
attatgcgtc atcatttctc ctttattctc 
24001 gctcacactc tcaccaccat tcaacgtcta cacttgtagg cgttttttga ttagtaaaat 

24061 cataatgaat cttctttggt taacttatcg ccatctattt tttgtgaaat aaattccaag 

24121 tatttacgcg cattatgtga cgataaatct ttaggtaact cataagtgaa tggttgatta 

24181 ccactagtta aaacttcata tactatagtt tcttttttta ttttgcaatt agttattttc 

24241 attataaact ccttttaaac actgctgaaa tagacgtctt tttcaaataa gcatgattaa 

24301 tactttaatt ctttaatcca catatattta aaagtgaggt agtaggtaat aaatataaga 

24361 cttaaagtta agattgcttt tttcatgtca atttctcctt tgtttatatt tatattaaag 

24421 cgctaaatat acgttattaa tcacaataca actttgccca ttactttaat atcactaaac 

24481 gaagcgactt tgatatcatc atacttcgga tttagagata ccaaattaat atagtcttcg 

24541 catatatcta cacgcttgat aagacttact ccatctaata caacgagtgc aattgtacca 

24601 tctttaatag aatcttcttt cttaataaaa gcgtatgttc cttgttttaa cataggttcc 

24661 attgaatcac cattaactaa aatacaaaaa tcagcatttg atggcgtttc gtcttcttta 

24721 aaaaatactt cttcatgcaa tatgtcatca tataattctt ctcctatgcc agcaccagtt 

24781 gcaccacatg caatatacga tactagttta gactctttat attcatctat agaagtgact 

24841 ttattctgtt catctaattg ctcatttgca tagttaagta cgttttcttg gcggggaggt 

24901 gtgagttgag aaaatatgtt attgattttt gacattatcg tttcatcttg acgttcttcg 

24961 tcaggaactc gataagaatc tacatcatac cccataagcc acgcttcacc gacatttaaa 

25021 gttttagata ataagaataa tttatgttgg tctggagaag accttccatt aacatactgg 

25081 gataagtgac tttttgacat tttaatattc aattcttttt gaaagggttt cgacttttct 

25141 agaatatcta cttgacgcaa gttcctatct ttcataattt gttttaatct ttcagaagtg 

25201 ttttgcattg gtaatgcctc cttgaaattc attatatagg aagggaaata aaaatcaata 

25261 caaaagttca acttttttaa ctttttgtgt tgacattgtt caaaattggg gttatagtta 

25321 ttatagttca aatgtttgaa cttaggaggt gattatttga atactaatac aacttttgat 

25381 ttttcgttat tgaacggtaa gatagtcgaa gtgtactcga cacaatttaa ctttgctata 

25441 gctttaggtg tatcagaaag aactttgtct ttgaagttga acaacaaagt accatggaaa 

25501 acaacagaca ttattaaagc ttgtaagtta ttgggaatac ctataaaaga tgttcacaaa 

25561 tattttttta aacagaaagt tcaaatgttt gaacttaata agtaaaggag gcataacaca 

25621 tgcaagaacg agaaaaggtt aataaaagta acacatcttc aaatgaagca tcaaaacctt 

25681 ttaggacaaa ttgaagctta cgacaaaacg cttaaagaaa taaagtacac tcgagacctt 

25741 tacaacaaac acctaagcat gaacaacgaa gacgcattcg ctggtttgga aatggtagag 

25801 gatgaaatta ctaaaaagct acgaagtgct atcaaagagt tccaaaaagt agtgaaagcg 

25861 ttagacaagc ttaacggtgt tgaaagcgat aacaaagtta ctgatttaac agagtggcgg 

25921 aaagtgaatc agtaacattc acttcttaat ataaccacgc ttatcaacat ccacattgag 

25981 cagatgtgag cgagagctgg cgatgatatg agccgcgttt aaatacattc gatagtcatt 

26041 gcgataaccg tctgctgaat gtgggtgttg aggaaaaagg aggatactca aatgcaagca 

26101 ttacaaacat ttaattttaa agagctacca gtaagaacag tagaaattga aaacgaacct 

26161 tattttgtag gaaaagatat tgctgagatt ttaggatatg caagatcaaa caatgccatt 

26221 agaaatcatg ttgatagcga ggacaagctg acgcaccaat ctagtgcatc aggtcaaaac 

26281 agaaatatga tcattatcaa cgaatcagga ttatacagtc taatcttcga tgcttctaaa 

26341 caaagcaaaa acgaaaaaat tagagaaacc gctagaaaat tcaaacgctg ggtaacatca 

26401 gatgtcctac cagctattcg caaacacggt atatacgcaa cagacaatgt aattgaacaa 

26461 acattaaaag atccagacta catcattaca gtgttgactg agtataagaa agaaaaagag 

26521 caaaacttac ttttacaaca gcaagtagaa gttaacaaac caaaagtatt attcgctgac 

26581 tcggtagctg gtagtgataa ttcaatactt gttggagaac tagcgaaaat acttaaacaa 

26641 aacggtgttg atataggaca aaacagattg ttcaaatggt taagaaataa tggatatctc 

26701 attaaaaaga gtggagaaag ttataactta ccaactcaaa agagtatgga tctaaaaatc 

26761 ttggatatca aaaaacgaat aattaataat ccagatggtt caagtaaagt atcacgtaca 

26821 ccaaaagtaa caggcaaagg acaacaatac tttgttaata agtttttagg agaaaaacaa 

26881 acatcttaaa aggaggaaca caatggaaca aatcacatta accaaagaag agttgaaaga 

26941 aattatagca aaagaagtta gagaggctat aaatggcaag aaaccaatca gttcaggttc 

27001 aattttcagt aaagtaagaa tcaataatga cgatttagaa gaaatcaata aaaaactcaa 

27061 tttcgcaaaa gatttgtcgc taggaagatt gaggaagctc aatcatccga ttccgctaaa 

27121 aaagtatcag catggcttcg aatcaattca tcaaaaagct tatgtacaag atgttcatga 

27181 ccatattaga aaattaacat tatcaatttt tggagtgaca cttaattcag acttgagtga 

27241 aagtgaatac aacctagcag caaaagttta tcgagaaatc aaaaactatt atttatacat 

27301 ctatgaaaag agagtttcag aattaactat cgatgatttc gaataaagga ggaacaacaa 

27361 atgttacaaa aatttagaat tgcgaaagaa aaaaataaat taaaactcaa attactcaag 

27421 catgctagtt actgtttaga aagaaacaac aaccctgaac tgttgcgagc agttgcagag 

27481 ttgttgaaaa aggttagcta aattcaacgg taaggatttg ccctgcctcc acacttagag 

27541 tttgagatcc aacaaacaca taagttttag tagggtctag aaaaaatgtt tcgatttcct 

27601 cttttgtaac agtttcaatt ccttcatatc ctggaaaaac aattttcttt aaatccgaaa 

27661 catgtttttt tgaaccatcc tttaaagtaa ctagaagttt catacttatc acctccttag 

27721 gttgataaca acattataca cgaaaggagc ataaacaata tgcaagcatt acaaacaaat 

27781 tcgaacatcg gagaaatgtt caatattcaa gaaaaagaaa atggagaaat cgcaatcagc 

27841 ggtcgagaac ttcatcaagc attagaagtt aagacagcat ataaagattg gtttccaaga 

27901 atgcttaaat acggatttga agaaaataca gattacacag ctatcgctca aaaaagagca 

27961 acagctcaag gcaatatgac tcactatatt gaccacgcac tcacactaga cactgcaaaa 

28021 gaaatcgcaa tgattcaacg tagtgaacct ggcaaacgtg caagacaata tttcatccaa 
28081 gttgaaaaag catggaacag cccagaaatg 

28141 aacacaatca atcaattaga aacaaagatt 

28201 gatgcagtag ctactactaa gacatcaatt 

28261 caaaacggta taaacatcgg gcaacgcaga 

28321 cttattaaac gcaagggtgt ggattataac 

28381 ttattcgaaa ttaaagaaac atcaatcaca 

28441 acgccaaaag taacaggtaa aggacaacaa 

28501 caaacaactt aataggagga attacaaatg 

28561 acaatggcag ttgtgacgtg gaaggtttgg 

28621 attagtagca gggcgttgag tgactatcta 

28681 gctgaaaatt ctactgaatc tgctcgtcgc 

28741 aaataacaac attatacacg aaaggaaaga 

28801 caccagaaaa cacatataga ggcgaagaaa 

28861 cacaaatcca tcaattgttt ggagtatgta 

28921 accgcaaaga taatttaggt gtagaaaatt 

28981 tgattaatat ttctaaattg gaagagtatt 

29041 aggatattaa atgagcaaca tttataaaag 

29101 cttagcgatt gtacttatgc cgtttctata 

29161 cgcaagtatc gcaacattca tgtactacaa 

29221 ctacttgttg gagcaagtaa cagtatcaaa 

29281 aacgaaaaac ggaggaagtc aagatgtatt 

29341 ttcatgttaa cggattcgat tttaagctat 

29401 tacaagttaa agatatgaac aacgtaccaa 

29461 acttagatat ggcatcagac ttatttaacc 

29521 cagacgaaca ggacagacta attaacttag 

29581 agactgtaac ttatatcatt cgtcataggg 

29641 ctgataacaa ttcagatatt agttactcca 

29701 gtatggaaga agcgagtatc aatatggatt 

29761 aaactattga gtacgaggag gtagaacatg 
29821 gtaagcatac tcaaaaaact aaagataaat 
29881 tataaatttg cagtatacgg aaaaattggc 
29941 aaagacgctt tcgtcattga cattaacgaa 

30001 gacgtagaaa tcgagaacta tcaacacttt 
30061 ttacaggaga tgagagaaaa cggacaagaa 
30121 aaacttagag atatgacatt gaatgatgtg 
30181 aatgattggg gagaagttgc tgaacgaatt 
30241 caagaagaat acaaattcca ctttgttatt 
30301 gatgatgaag gtagcactat caaccctact 
30361 aaagctatta cttctcaaag tgatgtgtta 
30421 aacggagaaa agaaagctag atatattcta 
30481 aagattagac attcaccttc aataacaatt 
30541 acggacgtag tagaagcaat tagaaatgga 
30601 aattatgaaa atcacaggac aagcgcaatt 
30661 taacggctca gcagggtttc aagctggaga 
30721 caatgataga gaaaatagat atttcacaat 
30781 taaacataat caatttgtac cgccgtataa 
30841 attagttact cgattaggta ttaagttaaa 
30901 tcttattggt aagttttgtc acttggtatt 
30961 gtattttacg gatttttcat ttattaaacc 
31021 acctattccg aagacagata agcaaaaagc 
31081 atcaatgtct caacaaagca atccatttga 
31141 agatttagcg ttttaaggtg tggtttaaat 
31201 cgacggtact tattccgtcg ttgctactgg 
31261 actagaaaac ggatatccac taaaagcaga 
31321 tatagaacaa cgcaaaaaaa tattcgcaat 
31381 accagtagaa tcaactagaa aattattaca 
31441 agaaatcagt ctgcgcgact gttctatgaa 
31501 agcgtttatg tttcatcatc aaatacctat 
31561 agataaagcg ttattatatt gggctacaat 
31621 tcacgcagac ctggcacatt atgaagcagt 
31681 ccactatgac aaacatgtat tagcgttatg 
31741 tggcgttaag tcgtttgatg ataaatacca 
31801 gaggctcaat aaaatgttga aaggagagaa 
31861 atagcactcc taatcgtcat cttggcggaa 
31921 gtggagaaaa ttttaaaatc tccgtttagt 
31981 aggcggacaa actaattgag ccttttttga 
32041 ttaatacttc aaattcaatg ccagaaagtt 
32101 ttaacattct tttaacaaat tctaatcccg 



attatgcaac gtgctttaaa aattgctaac 
gcacgtgaca aaccaaaaat tgtatttgca 
ttagttggag agttagcaaa gatcattaaa 
ttgtttgagt ggttacgtca aaacggattc 
atgcctacac agtattcaat ggaacgtgag 
cattcggacg gtcacacatc aattagtaag 
tactttgtta acaagttttt aggagaaaaa 
aacgcactat acaaaacaac cctcctcatc 
aagattgaga agcacactag aaaacctgtg 
aacaacaaat ctttaaccat accgaaagat 
cttttgaagt tcgccgaaca aactattagc 
tagaaatgcc aaaaatcata gtaccaccaa 
aatttgtgaa aaagttatac gcaacaccta 
gaagtacagt atacaactgg ttgaaatatt 
tatacattga ttattcacca acaggcactc 
tgatcagaaa gcataaaaaa tggtattagg 
ctacctagta gcagtattat gcttcacagt 
cttcactaca gcatggtcaa ttgcgggatt 
agaatgcttt ttcaaagaat aaaaaaactg 
cacttaagaa aaaattcatg ttcaatataa 
acgaaatagg cgaaatcata cgcaaaaata 
tcattttaaa aggtcatatg ggcatatcaa 
ttaaacatgc ttatgtcgta gatgagaatg 
aagcaataga tgaatggatt gaagagaaca 
tcatgaaatg gtaggaggtc gctatgaagc 
atatgccaat ttatataact aacaaaccaa 
caaatagaaa tagagctagg gagtttaacg 
atcacaaagc aatcaagaaa acagtgacag 
actgaggaaa aacaagaacc acaagaaaaa 
aatatcgctg agaaaaataa aaggaaattc 
tcaggaaaaa ccacgtttgc tacaagagat 
ggtggaacaa cggttactga cgaaggatca 
gtttatgttg taaatttttt acctcaaatt 
atcaatgttg tagttattga aactattcaa 
atgaaaaata agtctaaaaa accaacgttt 
gtcagtatgt acagattaat aggaaaactt 
acaggtcatg aaggtatcaa caaagataaa 
atcactattg aagcgcaaga acaaattaaa 
gctagggcaa tgattgaaga atttgatgat 
aacgctgaac cttctaatac gtttgaaaca 
aacaataaga aatttgcaaa tcctagcatt 
aactaaaaat taattaaaag gacggtattt 
tactaaagaa acaaatcaag aaaagtttta 
attcacagtg aaagttaaaa atattgaatt 
cgtatttgaa aatgatgaag gcaaacaata 
atatgatttc caagaaaaac aattgattga 
tcttcctagc ttagattttg ataccaatga 
gaaatggaaa ttcaatgaag atgaaggtaa 
ttacaaaaag ggcgatgatg ttgttaacaa 
tgaagaaaat aacggggcac aacaacaaac 
aagcagtggc caatttggat atgacgacca 
gcaatacatt acaagatacc agaaagataa 
tgttgaactt gaacaaagtc acattgactt 
agtagaggtt ccggacaata aaaaactatc 
gtgtagagat atagaacttc actggggcga 
aacagaattg gaaattatga aaggttatga 
agttgcaagg gagttaatag aactgattat 
gagtgtagaa acgagtaagt tgttaagcga 
caaccgcaac tgtgtaatat gcggaaagcc 
cggcagaggc atgaacagaa acaaaatgaa 
tcgcgaacat cacaacgagc aacatgcgat 
cttgcatgac tcgtggataa aagttgatga 
aaaggaatga atagactaag aataataaaa 
gagattagaa atgctatgca tgctgtaaaa 
taatacaggt ttttacaaaa gctttaccat 
tgtctattac ccaggggctg taatgtaact 
tacttattgt ttctaggttg tgtcctgact 
aaacaaatct ttgtttttct ataatcttat 






32161 taaagtgatt taaaaactga ggagcataaa 

32221 aagacatgtc aaaagtttca tttaaaaccc 

32281 cggttgattc tatatctaac ggagagtctt 

32341 cattctttgg gtttaaaacc gctctatatt 

32401 aatgttttaa aagaatagca tcatttgggg 

32461 ggtgggttaa tgagtttttt ctgtcatcca 

32521 tacttaaagt tttttcacta atgtaaaact 

32581 aaaattgtgg ttcttgtaaa ttatttttag 

32641 ctttgaattt ttcaaattct acttctcttt 

32701 atttcccaaa gacaagttcc caagttttag 

32761 cttcaataat tttatcaata cctttaccta 

32821 ctaacgcaat agcgataata aaattatacc 

32881 aagttactac tcaataatta cagcaaatgt 

32941 aaagttactt tttgcagaaa taacatcttt 

33001 taatggttac tttgcaactt tatacaacgt 

33061 gaaccttacc aactttggtt atctaaaaat 

33121 acaaaggaag atgtacccct tgacgcaaac 

33181 ccctattgat aattctgtca atacccctat 

33241 tattaataat acaagtaata acaatataaa 

33301 agcatcttct ataccctata aagaaattat 

33361 ttttaaacac aatacagcta aaacaaaaga 

33421 taggttggag gattttaaaa aggtgattga 

33481 tagcgataaa taccttagac cagaaacact 

33541 tcaaaaaata caaccaactg gcacggatca 

33601 ttgggattag ggggatatta tgaaaccact 

33661 aaaatatcaa cctactcatg tcgaaaaagg 

33721 cgacttatat aagtttgctc ctactaaaaa 

33781 ttgcaaatgt gaaatctatg aggaatataa 

33841 attcaatcaa tcaaacgtta atccgtcttt 

33901 acaaaatgaa aaacaagtac acgctaaaca 

33961 tacaaaagaa ccaaaatcat taatattgca 

34021 agcatacgct atcgcaaaag cagtcaaagc 

34081 accaatgttg atggatcgta tcaaagcgac 

34141 cgagctagtc agattgctaa gtgatattga 

34201 aaacacagag cacactttaa ataaactttt 

34261 caacatcttt acaactaact ttagtgataa 

34321 tataaattcg agaatgaaaa aaagagcaag 

34381 ggagcgagat gcatggtaac caaagaattt

34441 tacgctcaga aactcataga tgaggcacag

34501 atccaaaaac ttgcagaacg tcatacacgc

34561 aaaatgccga aagaaaaata ttacttatac 

34621 atcaagtata aagacaacgt aaatgaggtt 

34681 gaaaagaaaa ttatgactga tagtgaccta 

34741 tatgagcaag aattaggttt acaagcaacg 

34801 aaatacaacg ctaagaaagt tgagtacaaa

34861 gaatattacc aatatttaga aagtaatatg 

34921 caaccgaaat tcgaattatt accaaaacta 

34981 gacttcgcgt tatatctcga tggcaaactg 

35041 accgaagtag caaaacttaa agctaagatt 

35101 aattggatat gtaaagcgcc taagtataca 

35161 attaaagcaa gacgagaacg caaaagagaa 

35221 tataaatgca acgattgata taaggatacc 

35281 tgtggataaa gaaaaagaag cgctggcaga 

35341 agagtatgac aatttaaaaa ttagaaacgt 

35401 tgtaatcatt aataataaac catataaatt 

35461 agcgtgggat aaatgctgga attgtttcta 

35521 aagctttaga cgcgccttat ggcatgcacc 

35581 aaaagattaa acaagcgaga ctcgaacgtg 

35641 agctacgtaa gaagaagcca catttgttta 

35701 actggttcga tgtcacttat aaccaaatgt 

35761 aatcagtaac agaaaagtag atatgaacaa 

35821 ttacacatac ggcgacattg aaattataga 

35881 accacaatta gcattcgcaa taggtaatgc 

35941 gaatggtcat gaggatttag caaaggcgaa 

36001 ggagtgatga ccatgacaga tagcggacgt

36061 aagagatatc tgtatcagga taacgaacga 

36121 tattactttc acggtcatat cgtgccaggt 

36181 gcggaagagc ttgaaacata tataaagcaa 



acttattata aattcctttt tttgttaagt 
ctaaccttac taggttatta attgaaattt 
ttattaacgt gtccgatata ttcataccgt 
taacggcagg atgtacttcg tgattcttta 
ataattgttt aattatttca acaaatgaat 
tagatgatgc tattagtttt gcgaacatat 
ttgaagcttc tagagcagga cctagaagag 
gtacagaaga tatttctttt ttaaattgtt 
gataaataac tttatccaca taaaggtgga 
agaatgtttc tacaggccct tttgatgcgc 
aaataggatc cataattatt cacccccaat 
agaaaggaga atcaacatga ctgaccaacc 
cagatacgat aaccgactta ctgacagcga 
aagtaacaaa tacggatact gcacagcaag 
tgttaaggaa actatatctc gtagaatttc 
cgaaattatc aaagaaggta atgaagttaa 
gtcaatacct attgacgcaa aaatcaatac 
tgacgcaaat gtcaaagaga atattacaag 
tagaatagat atattgtcgg gcaacccgac 
cgattactta aacaaaaaag cgggcaagca 
ttttattaaa gcaagatgga atcaagattt 
tatcaaaaca gctgagtggc taaacacgga 
ttttggcagt aaatttgagg ggtacctcaa 
attggaacgc atgaagtacg acgaaagtta 
attcagcgaa aagataaacg aaagcttgaa 
attgaaatgt gagagatgtg gaagtgaata 
acacccgaat ggttacgagt ataaagacgg 
gcgaaacaag caacggaaga taaacaacat 
aagagatgca acagtcaaaa actacaagcc 
aacagcaata gagtacgtac aaggcttctc 
aggttcatac ggaactggta aaagccacct 
taaagggcat acggttgctt ttatgcacat 
atacaacaaa aatgcagtag agactacaga 
tttacttgta ctagatgata tgggtgtaga 
cagcattgtt gataacagag taggtaaaaa 
agaactaaat caaaatatga actggcaacg 
aaaagtaaga gtaatcggag acgatttcag 
ttaaaaacta aacttgagtg ttcagatatg 
ggcgatgaaa ataggttgta cgacctattt 
cccgctatcg tcgaatacta aggagtgtta 
cgagaagatg gcacagaaga tattaaggtc 
tattcgctca caggagccca tttcagcgac 
aaacgattca aaggcgctca cgggcttcta 
atatttgata tttagaggtg gacgatgagt 
ggaattgtat ttgatagcaa agtagagtgt 
aatggcacta attatgatca tatcgaaata 
gataaacaac gaaagattga atatattgca 
attgaagtta tcgacattaa aggtatgcca 
ttcagacata aatacagaaa cataaaactc 
ggtaaaacat ggattacgta cgaggaatta 
atgaagtgat ctaatgcaac aacaagcata 
tacagaagtt gaatatcagc attttgatga 
ttacttatat aacaatcctg acgaaatact 
aaatgtagag gtggaataaa tgggcagtgt 
taacaatttt gaaaaaagaa ataatggcaa 
aacgtgttag aggttgttgg gagttttcag 
taaaagaata tagagaaatg aaacaaatgg 
aattggaaag agagcgaaag aaagaggctg 
atgtacctca aaaacattca cgtgatccgt 
tcaagaaatg gagtgaagca taatgagcat 
aacgcaagac aacgttaagc aacctgcgca 
ttttattgaa caagttacgg cacagtaccc 
aattaaatac ttgtctagag caccgttaaa 
gttttacgtc gatagagtat ttgacttgtg 
aaagaatact taaaacattt tttcggctct 
gtggcacata tccatgtagt aaatggcact
tggcaaggtg tgaaaaagac atttgataca 
agtgatttgg aatatgagga acagaagcaa 
36241 ctaactttat tttaaaaggg cggaaacaat gaaaatcaaa attgaaaaag aaatgaattt

36301 acctgaactt atccaatggg cttgggataa ccccaagtta tcaggtaata aaagattcta 

36361 ttcaaatgat gttgagcgca actgttttgt gacttttcat gttgatagca tcttatgtaa 

36421 tgtgactgga tatgtatcaa ttaacgataa atttactgtt caagaggaga tataacaatg 

36481 aaaatcaaag ttaaaaaaga aatgagatta gatgaattaa ttaaatgggc gcgagaaaat 

36541 ccggatctat cacaaggaaa aatatttttt tcaacaggat ttagtgatgg attcgttcgt 

36601 tttcatccaa atacaaataa gtgttcgacg tcaagtttta ttccaattga tatccccttc 

36661 atagttgata ttgaaaaaga agtaacggaa gagactaagg ttgataggtt gattgaatta

36721 ttcgagattc aagaaggaga ctataactct acactatatg agaacactag tataaaagaa 

36781 tgtttatatg gcagatgtgt gcctaccaaa gcattctaca tcttaaacga tgacctaact 

36841 atgacgttaa tctggaaaga tggggagttg ctagtatgat gttgaaattt aaagcttggg 

36901 ataaagataa aaaagttatg agtattattg acgaaatcga ttttaatagt gggtacattt 

36961 tgatttcaac aggttataaa agtttcaatg aagtaaaact attacaatac acaggattta 

37021 aagatgtgca cggtgtggag atttatgaag gggatattgt tcaagattgt tattcgagag 

37081 aagtaagttt tatcgagttt aaagaaggag ccttttatat aacttttagc aatgtaactg 

37141 aattactaag tgaaaatgac gatattattg aaattgttgg aaatattttt gaaaatgaga 

37201 tgctattgga ggttatgaga tgacgttcac cttatcagat gaacaatata aaaatctttg 

37261 tactaactct aacaagttat tagataaact tcacaaagca ttaaaagatc gtgaagagta 

37321 caagaagcaa cgagatgagc ttattgggga tatagcgaag ttacgagatt gtaacaaaga 

37381 tctagagaag aaagcaagcg catgggatag gtattgcaag agcgttgaaa aagatttaat

37441 aaacgaattc ggtaacgatg atgaaagagt taaattcgga atggaattaa acaataaaat 

37501 ttttatggag gatgacacaa atgaataatc gcgaaaaaat cgaacagtcc gttattagtg 

37561 ctagtgcgta taacggtaat gacacagagg ggttgctaaa agagattgag gacgtgtata 

37621 agaaagcgca agcgtttgat gaaatacttg agggaatgac aaatgctatt caacattcag 

37681 ttaaagaagg tattgaactt gatgaagcag tagggattat ggcaggtcaa gttgtctata 

37741 aatatgagga ggaataggaa aatgactaac acattacaag taaaactatt atcaaaaaat 

37801 gctagaatgc ccgaacgaaa tcataagacg gatgcaggtt atgacatatt ctcagctgaa 

37861 actgtcgtac tcgaaccaca agaaaaagca gtgatcaaaa cagatgtagc tgtgagtata 

37921 ccagagggct atgtcggact attaactagt cgtagtggtg taagtagtaa aacgtattta 

37981 gtgattgaaa caggcaagat agacgcggga tatcatggca atttagggat taatatcaag 

38041 aatgatgaag aacgtgatgg aatacccttt ttatatgatg atatagacgc tgaattagaa 

38101 gatggattaa taagcatttt agatataaaa ggtaactatg tacaagatgg aagaggcata 

38161 agaagagttt accaaatcaa caaaggcgat aaactagctc aattggttat cgtgcctata 

38221 tggacaccgg aactaaagca agtggaggaa ttcgaaagtg tttcagaacg tggagcaaaa 

38281 ggcttcggaa gtagcggagt gtaaagacat cttagatcga gttaaggagg ttttggggaa 

38341 gtgacgcaat acttagtcac aacattcaaa gattcaacag gacgaccaca tgaacatatt 

38401 actgtggcta gagataatca gacgtttaca gttattgagg cagagagtaa agaagaagcg 

38461 aaagagaagt acgaggcaca agttaaaaga gatgcagtta ttaaagtggg tcagttgtat 

38521 gaaaatataa gggagtgtgg gaaatgacgg atgttaaaat taaaactatt tcaggtggag 

38581 tttattttgt aaaaacagct gaaccttttg aaaaatatgt tgaaagaatg acgagtttta 

38641 atggttatat ttacgcaagt actataatca agaaaccaac gtatattaaa acagatacga 

38701 ttgaatcaat cacacttatt gaggagcatg ggaaatgaat cagctgagaa ttttattaca 

38761 tgacggtagt agtttgatat tacatgaaga tgaattattt aacgaaatag tatttgtttt 

38821 ggacaatttt agaaatgatg atgactattt aacgatagaa aaagattatg gcagagaact 

38881 tgtattgaac aaaggttata tagttgggat caatgttgag gaggcagatg atgattaaca 

38941 tacctaaaat gaaattcccg aaaaagtaca ctgaaataat caaaaaatat aaaaataaag 

39001 cacctgaaga aaaggctaag attgaagatg attttattaa agaaattaaa gataaagaca 

39061 gtgaatttta cagtcctacg atggctaata tgaatgaata tgaattaagg gctatgttaa 

39121 gaatgatgcc tagtttaatt gatactggag atgacaatga tgattaaaaa acttaaaaat 

39181 atggatgggt tcgacatctt tattgttgga atactgtcat tattcggtat attcgcattg 

39241 ctacttgtta tcacattgcc tatctataca gtggctagtt accaacacaa agaattacat 

39301 caaggaacta ttacagataa atataacaag agacaagata aagaagacaa gttctatatt 

39361 gtattagaca acaaacaagt cattgaaaat tccgacttat tattcaaaaa gaaatttgat 

39421 agcgcagata tacaagctag gttaaaagta ggcgataagg tagaagttaa aacaatcggt 

39481 tatagaatac actttttaaa tttatatccg gtcttatacg aagtaaagaa ggtagataaa 

39541 caatgattaa acaaatacta agactattat tcttactagc aatgtatgag ttaggtaagt 

39601 atgtaactga gcaagtgtat attatgatga cggctaatga tgatgtagag gcgccgagtg 

39661 attacgtctt tcgagcggag gtgagtgaat aatgagaata tttatttatg atttgatcgt 

39721 tttgctgttt gctttcttaa tatccatata tattattgat gatggagtga taataaatgc 

39781 attaggaatt tttggtatgt ataaaattat agattccttt tcagaaaata ttataaagag 

39841 gtagataaaa atgaacgagc aaataatagg aagcatatat actttagcag gaggtgttgt 

39901 gctttattca gttaaagaga tttttaggta ttttacagat tctaacttac aacgtaaaaa 

39961 aatcaattta gaacaaatat atccgatata tttagattgt tttaaaaagg ctaaaaagat 

40021 gattggagct tatattattc caacagaaca gcatgaattt ttagattttt ttgatattga 

40081 agtctttaat aatttagata agcaaagtaa aaaagcgtat gaaaatgtta ttggatttag 

40141 acaaatgatt aatttatcaa atagagttaa ggcaatggaa gattttaaga tgagtttcaa 

40201 caatgaattt agtacaaatc agattttttt taatccttct tttgttatgg aaacaattgc 

40261 tattataaat gaatatcaaa aagatatatc ttatttaaaa aatataatta ataaaatgaa 
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40321 tgaaaataga gcttataatc atattgatag ttttatcact tcagagtacc gacgaaaaat 

40381 aaacgattat aatctttatc ttgataaatt tgaagaacag tttagtcaaa agtttaaaat 

40441 aaacagaact tcgataaaag aaagaattat tattaattta aacaagagga gatttaaatg 

40501 atgtggatta ctatgactat tgtatttgct atattgctat tagtttgtat cagtattaat 

40561 agtgatcgtg caagagagat acaagcactt agatatatga atgattatct acttgatgaa 

40621 gtagttaaaa ctaaagggta caacgggtta gaagaataca ggattgaatt gaagcgaatg 

40681 aataacgata ttaaaaagta atttatatta tcggaggtat tgcattgaat gataaagatt 

40741 gagaaacacg atatcaaaaa gcttgaagaa tacattcagc acatcgataa ctatcgaaga 

40801 gagttgaaga tgcgagaata tgaattactt gaaagtcatg aaccagataa tgcgggagct 

40861 ggcaaaagta atttgccggg taacccgatt gaacgatgtg caataaagaa gtttagtgat 

40921 aacaggtaca atacattaag aaatatagtt aacggtgbag atagattgat aggtgaaagt

40981 gatgaggata cgcttgagtt attaaggttt agatattggg attgtcctat tggttgttat 

41041 gaatgggaag atatagcaca ttactttggt acaagtaaga caagtatatt acgtagaagg 

41101 aatgcactga tcgataagtt agcaaagtat attggttatg tgtagcggac ttttacccta 

41161 tgtaagtccg cattaaaaca gtttattatg ttagtatcag attaatattt aaagttatta 

41221 aatgctaata cgacgcatga acaagaggcg catcactatg tgatgtgtct ttttatttat 

41281 gaggtatgaa catgttcaaa ctaattgtaa atacattact acacatcaag tatagatgag 

41341 tcttgatact acttaagtta tataaggtga aacattatga tgactaaaga cgaacgtata 

41401 cgattctata agtctaaaga atggcaaata acaagaaaaa gagtgctaga aagagataat 

41461 tatgaatgtc aacaatgtaa gagagacggc aagttaacga catatgacaa aagcaagcgt 

41521 aagtcgttgg atgtagatca tatattatcg ctagaacatc atccggagtt tgctcatgac 

41581 ttaaacaatt tagaaacact gtgtattaaa tgtcacaaca aaaaagaaaa gagatttata 

41641 aaaaaagaaa ataaatggaa agacgaaaaa tggtaaatac ccccgggtca aaaaaatcaa 

41701 aagcgatc 
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Table 4 



77ORF017 sequence 

23982 atgacgcataatatagaaaaacgcattaataaattaaaaacttct
1 MTHNIEKRINKLKTS

23937 ggaaatccaaaatttaaaaagttagattcagatattcactattta
16 GNPKFKKLDSDIHYL

23892 ctcaagagatttgaaggtgaaaaaaaccataaaggtttttatcca
31 LKRFEGEKNHKGFYP

23847 aagtttaaacaaggagaaatagtttttgtagatttcggtataaac
46 KFKQGEIVFVDFGIN

23802 gttaataaagaattttctaattcacactttgcaatagtgatgaat
61 VNKEFSNSHFAIVMN

23757 aaaaatgattctaatacggaggatatagtaaatgttattccctta
76 KNDSNTEDIVNVIPL

23712 tcctctaaagaaaacaaaaagtatttaaagatgaattttgatttg
91 SSKENKKYLKMNFDL

23667 aaatgggagtattatttaagattgtttttaaatttaattagcgcg
106 KWEYYLRLFLNLISA

23622 caaaataattcagctatattaaaagaagttttcgataaaaaatac
121 QNNSAILKEVFDKKY

23577 caaaaaaacaacacagaattcatcactaaagattattttattgaa
136 QKNNTEFITKDYFIE

23532 tttatatctgatagtttagaaattgaaaataaattaaataaaatt
151 FISDSLEIENKLNKI

23487 gacagaaacattaataacatagtatcagcaattgataaggtaaaa
166 DRNINNIVSAIDKVK

23442 aaattaaaaggtaatagttacgcttgcataaattctttccagccg
181 KLKGNSYACINSFQP

23397 attagtaagtttcgcataagaaaagttttaccccaaaaaattaaa
196 ISKFRIRKVLPQKIK

23352 aatccagtaatagattcttcggatattatgttactgataaataga
211 NPVIDSSDIMLLINR

23307 attaataataatatattgcagatccctgatataagatga 23269
226 INNNILQIPDIR*
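The pairing of each 45-nucleotide row with its 15-residue translation can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code (the table used for the listing is not stated in this section); the names `CODON_TABLE` and `translate` are illustrative and do not come from the specification.

```python
# Standard genetic code in the conventional t, c, a, g ordering (an assumption:
# the listing does not say which translation table was used).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string, codon by codon ('*' marks a stop)."""
    dna = dna.lower()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First and last rows of the 77ORF017 listing.
first_row = "atgacgcataatatagaaaaacgcattaataaattaaaaacttct"
last_row = "attaataataatatattgcagatccctgatataagatga"
print(translate(first_row))
print(translate(last_row))
```

Because the rows are printed 5' to 3' on the coding strand, no reverse complement is needed here even though the coordinates of 77ORF017 run from 23982 down to 23269.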
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Physico-chemical parameters of ORF 77ORF017 

1 MTHNIEKRIN KLKTSGNPKF KKLDSDIHYL LKRFEGEKNH KGFYPKFKQG EIVFVDFGIN 

61 VNKEFSNSHF AIVMNKNDSN TEDIVNVIPL SSKENKKYLK MNFDLKWEYY LRLFLNLISA 

121 QNNSAILKEV FDKKYQKNNT EFITKDYFIE FISDSLEIEN KLNKIDRNIN NIVSAIDKVK 

181 KLKGNSYACI NSFQPISKFR IRKVLPQKIK NPVIDSSDIM LLINRINNNI LQIPDIR 



Number of amino acids: 237

Average molecular weight (Daltons): 27887.38

Mean amino acid weight (Daltons): 117.67

Monoisotopic molecular weight (Daltons): 27869.83

Mean amino acid monoisotopic weight (Daltons): 117.59

Amino acid composition


Acid  Symbol  Number  %        Average %       Acid  Symbol  Number  %        Average %
                               in Swissprot                                   in Swissprot
Ala   A       5       2.11%    7.58%           Cys   C       1       0.42%    1.66%
Asp   D       14      5.91%    5.28%           Glu   E       13      5.49%    6.37%
Phe   F       16      6.75%    4.09%           Gly   G       6       2.53%    6.84%
His   H       4       1.69%    2.24%           Ile   I       29      12.24%   5.81%
Lys   K       33      13.92%   5.95%           Leu   L       19      8.02%    9.42%
Met   M       4       1.69%    2.37%           Asn   N       30      12.66%   4.45%
Pro   P       7       2.95%    4.9%            Gln   Q       6       2.53%    3.97%
Arg   R       8       3.38%    5.16%           Ser   S       17      7.17%    7.12%
Thr   T       5       2.11%    5.67%           Val   V       11      4.64%    6.58%
Trp   W       1       0.42%    1.23%           Tyr   Y       8       3.38%    3.18%



Number of acidic (negative) amino acids (ED): 27  11.39%

Number of basic (positive) amino acids (KR): 41  17.30%

Total charge (KRED): 68  28.69%

Net charge (KR - ED): 14  5.91%

Theoretical pI: 10.01

Total linear charge density: 0.30

Average hydrophobicity: -5.37

Ratio of hydrophilicity to hydrophobicity: 1.41

Percentage of hydrophilic amino acid: 57.81%

Percentage of hydrophobic amino acid: 42.19%

Ratio of %hydrophilic to %hydrophobic: 1.37
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77ORF019 sequence 

39851 atgaacgagcaaataataggaagcatatatactttagcaggaggt
1 MNEQIIGSIYTLAGG

39896 gttgtgctttattcagttaaagagatttttaggtattttacagat
16 VVLYSVKEIFRYFTD

39941 tctaacttacaacgtaaaaaaatcaatttagaacaaatatatccg
31 SNLQRKKINLEQIYP

39986 atatatttagattgttttaaaaaggctaaaaagatgattggagct
46 IYLDCFKKAKKMIGA

40031 tatattattccaacagaacagcatgaatttttagatttttttgat
61 YIIPTEQHEFLDFFD

40076 attgaagtctttaataatttagataagcaaagtaaaaaagcgtat
76 IEVFNNLDKQSKKAY

40121 gaaaatgttattggatttagacaaatgattaatttatcaaataga
91 ENVIGFRQMINLSNR

40166 gttaaggcaatggaagattttaagatgagtttcaacaatgaattt
106 VKAMEDFKMSFNNEF

40211 agtacaaatcagattttttttaatccttcttttgttatggaaaca
121 STNQIFFNPSFVMET

40256 attgctattataaatgaatatcaaaaagatatatcttatttaaaa
136 IAIINEYQKDISYLK

40301 aatataattaataaaatgaatgaaaatagagcttataatcatatt
151 NIINKMNENRAYNHI

40346 gatagttttatcacttcagagtaccgacgaaaaataaacgattat
166 DSFITSEYRRKINDY

40391 aatctttatcttgataaatttgaagaacagtttagtcaaaagttt
181 NLYLDKFEEQFSQKF

40436 aaaataaacagaacttcgataaaagaaagaattattattaattta
196 KINRTSIKERIIINL

40481 aacaagaggagatttaaatga 40501
211 NKRRFK*
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Physico-chemical parameters of ORF 77ORF019 

1 MNEQIIGSIY TLAGGVVLYS VKEIFRYFTD SNLQRKKINL EQIYPIYLDC FKKAKKMIGA

61 YIIPTEQHEF LDFFDIEVFN NLDKQSKKAY ENVIGFRQMI NLSNRVKAME DFKMSFNNEF

121 STNQIFFNPS FVMETIAIIN EYQKDISYLK NIINKMNENR AYNHIDSFIT SEYRRKINDY

181 NLYLDKFEEQ FSQKFKINRT SIKERIIINL NKRRFK



Number of amino acids: 216

Average molecular weight (Daltons): 26026.06 

Mean amino acid weight (Daltons): 120.49 

Monoisotopic molecular weight (Daltons): 26009.34 

Mean amino acid monoisotopic weight (Daltons): 120.41 
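The two "mean amino acid" figures are consistent with simply dividing the corresponding molecular weight by the number of residues. A small sketch of that arithmetic, using the 77ORF019 values listed above; the function name `mean_residue_weight` is illustrative, and the assumption that the original software computed the figure this way is ours.

```python
def mean_residue_weight(molecular_weight: float, n_residues: int) -> float:
    """Molecular weight divided by residue count, rounded to two decimals."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return round(molecular_weight / n_residues, 2)


# Values listed for 77ORF019 (216 residues).
print(mean_residue_weight(26026.06, 216))  # average molecular weight
print(mean_residue_weight(26009.34, 216))  # monoisotopic molecular weight
```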

Amino acid composition 



Acid  Symbol  Number  %        Average %       Acid  Symbol  Number  %        Average %
                               in Swissprot                                   in Swissprot
Ala   A       7       3.24%    7.58%           Cys   C       1       0.46%    1.66%
Asp   D       10      4.63%    5.28%           Glu   E       16      7.41%    6.37%
Phe   F       19      8.80%    4.09%           Gly   G       5       2.31%    6.84%
His   H       2       0.93%    2.24%           Ile   I       28      12.96%   5.81%
Lys   K       22      10.19%   5.95%           Leu   L       12      5.56%    9.42%
Met   M       7       3.24%    2.37%           Asn   N       23      10.65%   4.45%
Pro   P       3       1.39%    4.9%            Gln   Q       10      4.63%    3.97%
Arg   R       11      5.09%    5.16%           Ser   S       13      6.02%    7.12%
Thr   T       7       3.24%    5.67%           Val   V       7       3.24%    6.58%
Trp   W       0       0.00%    1.23%           Tyr   Y       13      6.02%    3.18%



Number of acidic (negative) amino acids (ED): 26  12.04%

Number of basic (positive) amino acids (KR): 33  15.28%

Total charge (KRED): 59  27.31%

Net charge (KR - ED): 7  3.24%

Theoretical pI: 9.52

Total linear charge density: 0.28

Average hydrophobicity: -4.84

Ratio of hydrophilicity to hydrophobicity: 1.37

Percentage of hydrophilic amino acid: 54.17%

Percentage of hydrophobic amino acid: 45.83%

Ratio of %hydrophilic to %hydrophobic: 1.18
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77ORF043 sequence 

29304 atgtattacgaaataggcgaaatcatacgcaaaaatattcatgtt
1 MYYEIGEIIRKNIHV

29349 aacggattcgattttaagctattcattttaaaaggtcatatgggc
16 NGFDFKLFILKGHMG

29394 atatcaatacaagttaaagatatgaacaacgtaccaattaaacat
31 ISIQVKDMNNVPIKH

29439 gcttatgtcgtagatgagaatgacttagatatggcatcagactta
46 AYVVDENDLDMASDL

29484 tttaaccaagcaatagatgaatggattgaagagaacacagacgaa
61 FNQAIDEWIEENTDE

29529 caggacagactaattaacttagtcatgaaatggtag 29564
76 QDRLINLVMKW*
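The genome coordinates printed at the start and end of each ORF listing determine the protein length: the span is inclusive, must be a whole number of codons, and its last codon is the stop. A brief sketch, assuming inclusive 1-based coordinates as printed; `orf_protein_length` is an illustrative name, not part of the specification.

```python
def orf_protein_length(start: int, end: int) -> int:
    """Residue count for an ORF given inclusive coordinates (stop codon excluded).

    Works for either strand, since only the size of the span matters.
    """
    n_nucleotides = abs(end - start) + 1
    if n_nucleotides % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return n_nucleotides // 3 - 1  # the final codon is the stop codon


print(orf_protein_length(29304, 29564))  # 77ORF043
print(orf_protein_length(23982, 23269))  # 77ORF017, listed in descending order
print(orf_protein_length(39851, 40501))  # 77ORF019
```

The results agree with the "Number of amino acids" entries given for these ORFs (86, 237 and 216).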
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Physico-chemical parameters of ORF 77ORF043 

1 MYYEIGEIIR KNIHVNGFDF KLFILKGHMG ISIQVKDMNN VPIKHAYVVD ENDLDMASDL

61 FNQAIDEWIE ENTDEQDRLI NLVMKW



Number of amino acids: 86

Average molecular weight (Daltons): 10186.68

Mean amino acid weight (Daltons): 118.45

Monoisotopic molecular weight (Daltons): 10180.02

Mean amino acid monoisotopic weight (Daltons): 118.37



Acid  Symbol  Number  %        Average %       Acid  Symbol  Number  %        Average %
                               in Swissprot                                   in Swissprot
Ala   A       3       3.49%    7.58%           Cys   C       0       0.00%    1.66%
Asp   D       9       10.47%   5.28%           Glu   E       7       8.14%    6.37%
Phe   F       4       4.65%    4.09%           Gly   G       4       4.65%    6.84%
His   H       3       3.49%    2.24%           Ile   I       11      12.79%   5.81%
Lys   K       6       6.98%    5.95%           Leu   L       6       6.98%    9.42%
Met   M       5       5.81%    2.37%           Asn   N       8       9.30%    4.45%
Pro   P       1       1.16%    4.9%            Gln   Q       3       3.49%    3.97%
Arg   R       2       2.33%    5.16%           Ser   S       2       2.33%    7.12%
Thr   T       1       1.16%    5.67%           Val   V       6       6.98%    6.58%
Trp   W       2       2.33%    1.23%           Tyr   Y       3       3.49%    3.18%



Number of acidic (negative) amino acids (ED): 16  18.60%

Number of basic (positive) amino acids (KR): 8  9.30%

Total charge (KRED): 24  27.91%

Net charge (KR - ED): -8  9.30%

Theoretical pI: 4.38

Total linear charge density: 0.30

Average hydrophobicity: -2.80

Ratio of hydrophilicity to hydrophobicity: 1.19

Percentage of hydrophilic amino acid: 48.84%

Percentage of hydrophobic amino acid: 51.16%

Ratio of %hydrophilic to %hydrophobic: 0.95
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77ORF102 sequence 



29051 atgagcaacatttataaaagctacctagtagcagtattatgcttc
1 MSNIYKSYLVAVLCF

29096 acagtcttagcgattgtacttatgccgtttctatacttcactaca
16 TVLAIVLMPFLYFTT

29141 gcatggtcaattgcgggattcgcaagtatcgcaacattcatgtac
31 AWSIAGFASIATFMY

29186 tacaaagaatgctttttcaaagaataa 29212
46 YKECFFKE*
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Physico-chemical parameters of ORF 77ORF102 

1 MSNIYKSYLV AVLCFTVLAI VLMPFLYFTT AWSIAGFASI ATFMYYKECF FKE 



Number of amino acids: 53 

Average molecular weight (Daltons): 6155.42 

Mean amino acid weight (Daltons): 116.14

Monoisotopic molecular weight (Daltons): 6151.07

Mean amino acid monoisotopic weight (Daltons): 116.06

Amino acid composition 



Acid  Symbol  Number  %        Average %       Acid  Symbol  Number  %        Average %
                               in Swissprot                                   in Swissprot
Ala   A       6       11.32%   7.58%           Cys   C       2       3.77%    1.66%
Asp   D       0       0.00%    5.28%           Glu   E       2       3.77%    6.37%
Phe   F       7       13.21%   4.09%           Gly   G       1       1.89%    6.84%
His   H       0       0.00%    2.24%           Ile   I       4       7.55%    5.81%
Lys   K       3       5.66%    5.95%           Leu   L       5       9.43%    9.42%
Met   M       3       5.66%    2.37%           Asn   N       1       1.89%    4.45%
Pro   P       1       1.89%    4.9%            Gln   Q       0       0.00%    3.97%
Arg   R       0       0.00%    5.16%           Ser   S       4       7.55%    7.12%
Thr   T       4       7.55%    5.67%           Val   V       4       7.55%    6.58%
Trp   W       1       1.89%    1.23%           Tyr   Y       5       9.43%    3.18%



Number of acidic (negative) amino acids (ED): 2  3.77%

Number of basic (positive) amino acids (KR): 3  5.66%

Total charge (KRED): 5  9.43%

Net charge (KR - ED): 1  1.89%

Theoretical pI: 8.18

Total linear charge density: 0.13

Average hydrophobicity: 10.81

Ratio of hydrophilicity to hydrophobicity: 0.40

Percentage of hydrophilic amino acid: 28.30%

Percentage of hydrophobic amino acid: 71.70%

Ratio of %hydrophilic to %hydrophobic:
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77ORF104 sequence 



34393 atggtaaccaaagaatttttaaaaactaaacttgagtgttcagat
1 MVTKEFLKTKLECSD

34438 atgtacgctcagaaactcatagatgaggcacagggcgatgaaaat
16 MYAQKLIDEAQGDEN

34483 aggttgtacgacctatttatccaaaaacttgcagaacgtcataca
31 RLYDLFIQKLAERHT

34528 cgccccgctatcgtcgaatattaa 34551
46 RPAIVEY*
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Physico-chemical parameters of ORF 77ORF104 

1 MVTKEFLKTK LECSDMYAQK LIDEAQGDEN RLYDLFIQKL AERHTRPAIV EY 



Number of amino acids: 52

Average molecular weight (Daltons): 6193.13

Mean amino acid weight (Daltons): 119.10

Monoisotopic molecular weight (Daltons): 6189.12

Mean amino acid monoisotopic weight (Daltons): 119.02

Amino acid composition



Acid  Symbol  Number  %        Average %       Acid  Symbol  Number  %        Average %
                               in Swissprot                                   in Swissprot
Ala   A       4       7.69%    7.58%           Cys   C       1       1.92%    1.66%
Asp   D       4       7.69%    5.28%           Glu   E       6       11.54%   6.37%
Phe   F       2       3.85%    4.09%           Gly   G       1       1.92%    6.84%
His   H       1       1.92%    2.24%           Ile   I       3       5.77%    5.81%
Lys   K       5       9.62%    5.95%           Leu   L       6       11.54%   9.42%
Met   M       2       3.85%    2.37%           Asn   N       1       1.92%    4.45%
Pro   P       1       1.92%    4.9%            Gln   Q       3       5.77%    3.97%
Arg   R       3       5.77%    5.16%           Ser   S       1       1.92%    7.12%
Thr   T       3       5.77%    5.67%           Val   V       2       3.85%    6.58%
Trp   W       0       0.00%    1.23%           Tyr   Y       3       5.77%    3.18%



Number of acidic (negative) amino acids (ED): 10  19.23%

Number of basic (positive) amino acids (KR): 8  15.38%

Total charge (KRED): 18  34.62%

Net charge (KR - ED): -2  3.85%

Theoretical pI: 5.03

Total linear charge density: 0.38

Average hydrophobicity: -5.81

Ratio of hydrophilicity to hydrophobicity: 1.47

Percentage of hydrophilic amino acid: 53.85%

Percentage of hydrophobic amino acid: 46.15%

Ratio of %hydrophilic to %hydrophobic: 1.17
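The charge-related entries follow directly from residue counts: acidic residues are D and E, basic residues are K and R, and each percentage is the count divided by the sequence length. The sketch below recomputes them for the 52-residue 77ORF104 sequence; `charge_summary` is an illustrative name, and reporting the net-charge percentage from the absolute value of the net charge is an assumption made to match the way the listing prints it.

```python
from collections import Counter


def charge_summary(protein: str) -> dict:
    """Count acidic (D, E) and basic (K, R) residues and derive percentages."""
    counts = Counter(protein.upper())
    n = len(protein)
    acidic = counts["D"] + counts["E"]
    basic = counts["K"] + counts["R"]

    def pct(count: int) -> float:
        return round(100.0 * count / n, 2)

    return {
        "length": n,
        "acidic": acidic,
        "acidic_pct": pct(acidic),
        "basic": basic,
        "basic_pct": pct(basic),
        "total_charged": acidic + basic,
        "total_charged_pct": pct(acidic + basic),
        "net_charge": basic - acidic,
        # Assumption: the listing shows the magnitude of the net charge as a percentage.
        "net_charge_pct": pct(abs(basic - acidic)),
    }


ORF104 = "MVTKEFLKTKLECSDMYAQKLIDEAQGDENRLYDLFIQKLAERHTRPAIVEY"
summary = charge_summary(ORF104)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The theoretical pI, charge density and hydrophobicity figures depend on scales that this section does not specify, so they are not reproduced here.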
77ORF182 sequence

29268 atgttcaatataaaacgaaaaacggaggaagtcaagatgtattac
1 MFNIKRKTEEVKMYY

29313 gaaataggcgaaatcatacgcaaaaatattcatgttaacggattc
16 EIGEIIRKNIHVNGF

29358 gattttaagctattcattttaaaaggtcatatgggcatatcaata
31 DFKLFILKGHMGISI

29403 caagttaaagatatgaacaacgtaccaattaaacatgcttatgtc
46 QVKDMNNVPIKHAYV

29448 gtagatgagaatgacttagatatggcatcagacttatttaaccaa
61 VDENDLDMASDLFNQ

29493 gcaatagatgaatggattgaagagaacacagacgaacaggacaga
76 AIDEWIEENTDEQDR

29538 ctaattaacttagtcatgaaatggtag 29564
91 LINLVMKW*
Physico-chemical parameters of ORF 77ORF182

1 MFNIKRKTEE VKMYYEIGEI IRKNIHVNGF DFKLFILKGH MGISIQVKDM NNVPIKHAYV

61 VDENDLDMAS DLFNQAIDEW IEENTDEQDR LINLVMKW



Number of amino acids: 98

Average molecular weight (Daltons): 11691.50

Mean amino acid weight (Daltons): 119.30

Monoisotopic molecular weight (Daltons): 11683.84

Mean amino acid monoisotopic weight (Daltons): 119.22



Amino acid composition 



Acid  Symbol  Number  %        Average %       Acid  Symbol  Number  %        Average %
                               in Swissprot                                   in Swissprot
Ala   A       3       3.06%    7.58%           Cys   C       0       0.00%    1.66%
Asp   D       9       9.18%    5.28%           Glu   E       9       9.18%    6.37%
Phe   F       5       5.10%    4.09%           Gly   G       4       4.08%    6.84%
His   H       3       3.06%    2.24%           Ile   I       12      12.24%   5.81%
Lys   K       9       9.18%    5.95%           Leu   L       6       6.12%    9.42%
Met   M       6       6.12%    2.37%           Asn   N       9       9.18%    4.45%
Pro   P       1       1.02%    4.9%            Gln   Q       3       3.06%    3.97%
Arg   R       3       3.06%    5.16%           Ser   S       2       2.04%    7.12%
Thr   T       2       2.04%    5.67%           Val   V       7       7.14%    6.58%
Trp   W       2       2.04%    1.23%           Tyr   Y       3       3.06%    3.18%



Number of acidic (negative) amino acids (ED): 18  18.37%

Number of basic (positive) amino acids (KR): 12  12.24%

Total charge (KRED): 30  30.61%

Net charge (KR - ED): -6  6.12%

Theoretical pI: 4.76

Total linear charge density: 0.33

Average hydrophobicity: -3.89

Ratio of hydrophilicity to hydrophobicity: 1.28

Percentage of hydrophilic amino acid:

Percentage of hydrophobic amino acid:

Ratio of %hydrophilic to %hydrophobic:
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Table 5 



BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100017|lan|77ORF017 Phage 77 ORF|23269-23982|-3
(237 letters)

Database: nr
393,678 sequences; 120,452,765 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

gi|4493986|emb|CAB39045.1| (AL034559) predicted using hexExon;.          41   0.010
gi|730607|sp|P23250|RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR P.         38   0.053
gi|3097044|emb|CAA75299| (Y15035) KIR [Cowpox virus]                     38   0.090
gi|2146245|pir||S73794 hypothetical protein H91_orf180 - Mycopl.         38   0.090
gi|83910|pir||S04682 ribosomal protein var1 - yeast (Candida gl.         37   0.15
gi|133135|sp|P21358|RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN.          37   0.15
gi|2128843|pir||H64475 hypothetical protein MJ1409 - Methanococ.         36   0.20
gi|5107017|gb|AAD39926.1|AF126285_2 (AF126285) RNA polymerase [.         36   0.35
gi|2146210|pir||S73342 hypothetical protein E07_orf166 - Mycopl.         35   0.60

Database: swissprot
79,449 sequences; 28,874,452 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

sp|P23250 RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR PROTEIN.             38   0.014
sp|P21358 RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1.               37   0.040
sp|Q21444 LDLC_CAEEL LDLC PROTEIN HOMOLOG.                               34   0.35
sp|P27240 RFAY_ECOLI LIPOPOLYSACCHARIDE CORE BIOSYNTHESIS PROT.          33   0.46
sp|P53192 YGC0_YEAST HYPOTHETICAL 27.1 KD PROTEIN IN ALK1-CKB1.          33   0.60
sp|P32908 SMC1_YEAST CHROMOSOME SEGREGATION PROTEIN SMC1 (DA-B.          33   0.60
sp|P54683 TAGB_DICDI PRESTALK-SPECIFIC PROTEIN TAGB PRECURSOR.           32   0.78
sp|Q03100 CYAA_DICDI ADENYLATE CYCLASE, AGGREGATION SPECIFIC (.          32   0.78
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BLASTP 2.0.8 [Jan-05-1999] 

Query= sid|100019|lan|77ORF019 Phage 77 ORF|39851-40501|2
(216 letters)

Database: nr
373,355 sequences; 114,214,446 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

gi|3341966|dbj|BAA31932| (AB009866) orf 59 [bacteriophage phi PVL]      437   e-122
gi|2689911 (AE000792) B. burgdorferi predicted coding region BB.         38   0.058
gi|1171589|emb|CAA64574| (X95275) frameshift [Plasmodium falcip.         37   0.10
gi|4493986|emb|CAB39045.1| (AL034559) predicted using hexExon;.          36   0.23
gi|141257|sp|P18019|YPI9_CLOPE HYPOTHETICAL 14.5 KD PROTEIN (OR.         36   0.29
gi|133412|sp|P27059|RPOB_ASTLO DNA-DIRECTED RNA POLYMERASE BETA.         35   0.51
gi|3122231|sp|Q58851|HISX_METJA HISTIDINOL DEHYDROGENASE (HDH).          35   0.51
gi|3649757|emb|CAB11106.1| (Z98547) predicted using hexExon; MA.         34   0.66
gi|2688313 (AE001146) sensory transduction histidine kinase, pu.         34   0.87

Database: swissprot
79,449 sequences; 28,874,452 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

sp|P18019 YPI9_CLOPE HYPOTHETICAL 14.5 KD PROTEIN (ORF9).                36   0.079
sp|Q58851 HISX_METJA HISTIDINOL DEHYDROGENASE (EC 1.1.1.23) (H.          35   0.14
sp|P27059 RPOB_ASTLO DNA-DIRECTED RNA POLYMERASE BETA CHAIN (E.          35   0.14
sp|Q02224 CENE_HUMAN CENTROMERIC PROTEIN E (CENP-E PROTEIN).             34   0.31
sp|P04931 ARP_PLAFA ASPARAGINE-RICH PROTEIN (AG319) (ARP) (FRA.          33   0.53
sp|P18011 IPAB_SHIFL 62 KD MEMBRANE ANTIGEN.                             32   0.69
sp|P18709 VTA2_XENLA VITELLOGENIN A2 PRECURSOR (VTG A2) [CONTA.          32   0.90
sp|Q64409 CP3H_CAVPO CYTOCHROME P450 3A17 (EC 1.14.14.1) (CYPI.          32   0.90
sp|P21358 RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1.               32   0.90
sp|Q03945 IPAB_SHIDY 62 KD MEMBRANE ANTIGEN.                             32   1.2
BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100043|lan|77ORF043 Phage 77 ORF|29304-29564|3
(86 letters)

Database: nr
373,355 sequences; 114,214,446 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

gi|3341947|dbj|BAA31913| (AB009866) orf 39 [bacteriophage phi PVL]      182   6e-46
gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo.         32   0.84
gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN.         32   0.84
gi|1169735|sp|P42345|FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTE.         32   0.84
gi|3282239 (U88966) rapamycin associated protein FRAP 2 [Homo sa.        32   0.84
gi|3875402|emb|CAA98122| (Z73906) CDNA EST EMBL:D64544 comes fr.         31   2.5
gi|1084792|pir||S54091 hypothetical protein YPR070w - yeast (Sa.         30   4.2

Database: swissprot
79,449 sequences; 28,874,452 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

sp|P42345 FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP).           32   0.24
sp|P42346 FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R.          32   0.24
sp|P34554 YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C.          28   3.5
sp|Q24118 LIO_DROME LINOTTE PROTEIN.                                     28   3.5
sp|P80034 ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II).                      28   3.5
sp|P22922 A1AT_BOMMO ANTITRYPSIN PRECURSOR (AT).                         28   3.5
sp|Q44363 TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA.                     28   3.5
sp|P38255 YBU5_YEAST HYPOTHETICAL 51.3 KD PROTEIN IN PH05-VPS1.          27   6.0
sp|P55822 SH3B_HUMAN SH3BGR PROTEIN (21-GLUTAMIC ACID-RICH PRO.          27   7.9
sp|Q58482 YA82_METJA HYPOTHETICAL PROTEIN MJ1082.                        27   7.9
sp|P34252 YKK8_YEAST HYPOTHETICAL 52.3 KD PROTEIN IN HAP4-AAT1.          27   7.9

BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100102|lan|77ORF102 Phage 77 ORF|29051-29212|2
(53 letters)

Database: nr

373,355 sequences; 114,214,446 total letters



Sequences producing significant alignments:                         Score     E
                                                                    (bits)  Value

gi|3341946|dbj|BAA31912| (AB009866) orf 38 [bacteriophage phi PVL]     96   3e-20
gi|4325288|gb|AAD17315| (AF123593) voltage-dependent sodium cha...     28   7.1
gi|2649684 (AE001040) A. fulgidus predicted coding region AF092...     28   9.3

Database: swissprot

79,449 sequences; 28,874,452 total letters

Sequences producing significant alignments:                         Score     E
                                                                    (bits)  Value

sp|P42087 HUTM_BACSU PUTATIVE HISTIDINE PERMEASE.                      26   7.1
sp|P04775 CIN2_RAT SODIUM CHANNEL PROTEIN, BRAIN II ALPHA SUBU...      26   9.2
sp|P42619 YQJF_ECOLI HYPOTHETICAL 17.2 KD PROTEIN IN EXUR-TDCC...      26   9.2

BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100104|lan|77ORF104 Phage 77 ORF|34393-34551|1
(52 letters)

Database: nr

373,355 sequences; 114,214,446 total letters



Sequences producing significant alignments:                         Score     E
                                                                    (bits)  Value

gi|2315523 (AF016452) similar to the leucine-rich domains found...     29   4.2
gi|4377168|gb|AAD18990| (AE001666) CT711 hypothetical protein [...     29   5.4
gi|3882171|dbj|BAA34445| (AB018268) KIAA0725 protein [Homo sapi...     28   9.3

Database: swissprot

79,449 sequences; 28,874,452 total letters

Sequences producing significant alignments:                         Score     E
                                                                    (bits)  Value

sp|P04879 RRPP_VSVIG RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48...      27   5.4
sp|P04880 RRPP_VSVIM RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48...      27   5.4
sp|Q13946 CN7A_HUMAN HIGH-AFFINITY CAMP-SPECIFIC 3',5'-CYCLIC...       26   7.1
sp|P35381 ATPA_DROME ATP SYNTHASE ALPHA CHAIN, MITOCHONDRIAL P...      26   9.3
sp|P54659 MVPB_DICDI MAJOR VAULT PROTEIN BETA (MVP-BETA).              26   9.3
sp|P40397 YHXC_BACSU HYPOTHETICAL OXIDOREDUCTASE IN APRE-COMK...       26   9.3

BLASTP 2.0.8 [Jan-05-1999]

Query= sid|122748|lan|77ORF182 Phage 77 ORF|29268-29564|3
(98 letters) 

Database: nr 

393,678 sequences; 120,452,765 total letters 



Sequences producing significant alignments:

gi|1084792|pir||S54091 hypothetical protein YPR070w - yeast (Sa...
gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN...
gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo...
gi|5051381|emb|CAB44736.1| (AL049653) dJ647M16.2 (FK506 binding...
gi|4826730|ref|NP_004949.1|pFRAP1| FK506 binding protein 12-rap...
gi|3282239 (U88966) rapamycin associated protein FRAP 2 [Homo sa...

Score     E
(bits)  Value

 182    8e-46

Database: swissprot

79,909 sequences; 29,054,478 total letters

Sequences producing significant alignments:                         Score     E
                                                                    (bits)  Value

sp|P42345 FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP)...       32   0.29
sp|P42346 FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R...      32   0.29
sp|P40557 YIA5_YEAST PUTATIVE DISULFIDE ISOMERASE YIL005W PREC...      29   3.3
sp|Q24118 LIO_DROME LINOTTE PROTEIN.                                   28   4.4
sp|Q44363 TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA.                   28   4.4
sp|P80034 ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II).                    28   4.4
sp|P34554 YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C...      28   4.4
sp|P22922 A1AT_BOMMO ANTITRYPSIN PRECURSOR (AT).                       28   4.4
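
Each BLASTP query header above gives a phage 77 ORF as a nucleotide coordinate range, a trailing frame number, and a length in letters (amino acids). The following is a minimal sketch of how those three values relate. It assumes the coordinates are 1-based and inclusive and that the range includes the stop codon; this is an inference from the printed numbers, not something the reports state. The function names `orf_residue_count` and `reading_frame` are hypothetical.

```python
def orf_residue_count(start: int, end: int) -> int:
    """Amino-acid count for an ORF given inclusive nucleotide coordinates.

    Assumption: the range includes the stop codon, so one codon is subtracted.
    """
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return nucleotides // 3 - 1


def reading_frame(start: int) -> int:
    """Forward reading frame (1, 2 or 3) implied by a 1-based start coordinate."""
    return (start - 1) % 3 + 1


# (start, end, frame, letters) as read from the four query headers.
QUERIES = {
    "77ORF043": (29304, 29564, 3, 86),
    "77ORF102": (29051, 29212, 2, 53),
    "77ORF104": (34393, 34551, 1, 52),
    "77ORF182": (29268, 29564, 3, 98),
}

for name, (start, end, frame, letters) in QUERIES.items():
    # Both checks hold for every header under the stated assumptions.
    assert orf_residue_count(start, end) == letters
    assert reading_frame(start) == frame
    print(name, orf_residue_count(start, end), reading_frame(start))
```

Under these assumptions all four headers are internally consistent, which also supports reading the garbled final field of the 77ORF104 header as frame 1.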



Table 6 

1st position              2nd position              3rd position
(5' end)          U       C       A       G         (3' end)

U                Phe     Ser     Tyr     Cys            U
                 Phe     Ser     Tyr     Cys            C
                 Leu     Ser     Stop    Stop           A
                 Leu     Ser     Stop    Trp            G

C                Leu     Pro     His     Arg            U
                 Leu     Pro     His     Arg            C
                 Leu     Pro     Gln     Arg            A
                 Leu     Pro     Gln     Arg            G

A                Ile     Thr     Asn     Ser            U
                 Ile     Thr     Asn     Ser            C
                 Ile     Thr     Lys     Arg            A
                 Met     Thr     Lys     Arg            G

G                Val     Ala     Asp     Gly            U
                 Val     Ala     Asp     Gly            C
                 Val     Ala     Glu     Gly            A
                 Val     Ala     Glu     Gly            G

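
Table 6 is the standard genetic code, indexed by the first, second and third codon positions in the order U, C, A, G. As a minimal sketch of how the table is applied, the Python below builds the same 64-codon lookup and translates a nucleotide string until the first stop codon. It uses one-letter amino acid codes rather than the three-letter names in the table, and the names `CODON_TABLE` and `translate` are hypothetical. Choosing a start position and reading frame is left to the caller.

```python
from itertools import product

BASES = "UCAG"  # row/column order used in Table 6

# One-letter amino acids in table order: first position varies slowest,
# third position fastest; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 up to (not including) the first stop.

    Spaces are ignored so that blocked sequence text, as in a listing with
    ten-letter groups, can be pasted directly. Codons containing characters
    other than A, C, G, T/U are reported as "X".
    """
    rna = dna.replace(" ", "").upper().replace("T", "U")
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE.get(rna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("atg gct taa"))  # Met, Ala, then a stop codon
```

Generating the dictionary from the table's own U, C, A, G ordering keeps the code checkable against Table 6 row by row, rather than relying on 64 hand-typed entries.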
Table 7

Bacteriophage 3A, complete genome sequence

1 caaacgctag caacgcggat aaatttttca tgaaaggggg tctttatatg aagttaacaa aaaaacagct 
71 aaaagaatat atagaagatt acaaaaaatc tgatgacata ttaattaatt tgtatataga aacatatgaa 
141 ttttattgtc ggttaagaga tgaacttaaa aatagtgatt taatgataga gcatacaaac aaggctggtg 
211 cgagcaatat tattaagaat ccattaagca tagaactgac aaaaacagtt caaacactaa ataacttact 
281 caagtctatg ggtttaactg cagcacaaag aaaaaagata gttcaagaag aaggtggatt cggtgactat 
351 taaagtttta aatgaacctt caccaaaact attaacaaca tggtatgcag agcaagtcac tcaagggaaa
421 ataaaaacaa gcaaatatgt tagaaaagaa tgtgagagac atcttagata tctagaaaat ggaggtaaat 
491 gggtatttga tgaagaatta gcgcatcgtc ctattcgatt tatagaaaag ttttgtaaac cttccaaagg
561 atctaaacgt caacttgtat tacagccatg gcaacatttt attatcggca gtttgtttgg ttgggttcat 
631 aaagaaacaa aactgcgcag gtttaaagaa gctttgatat ttatggggcg aaaaaatggt aaaacaacca 
701 ctatttctgg ggttgctaac tatgctgtat cacaagatgg agaaaatggt gcagaaattc atttgttagc 
771 aaacgtaatg aaacaagcta ggattctatt tgatgaatct aaggcgatga ttaaagctag cccaaagctt 
841 gataaaaatt tcagaacatt aagagatgaa atccattatg acgcaacgat atcaaaaatt atgccccaag 
911 catcagatag cgataagtta gatggattga atacacacat ggggattttt gatgaaattc atgaatttaa 
981 agactataaa ttgatttcag ttataaaaaa ctcaagagct gcaaggttac aacctcttct catctacatt 
1051 acgacagcag ggtatcaatt agatggtcca cttgttgata tggtagaagc gggaagagac accttagatc 
1121 aaatcataga agacgaaaga actttttatt atttagcatc tttggatgat gacgatgata ttaatgattc 
1191 gtcgaactgg ataaaagcaa atcccaactt aggtgtctct ataaatttag atgagatgaa agaagagtgg
1261 gaaaaagcta agagaacacc agctgaacgt ggagatttta taaccaaaag gtttaatatc tttgctaata 
1331 atgacgagat gagttttatt gattacccaa cactccaaaa aaataatgaa attgtttctt tagaagagct 
1401 ggaaggcaga ccgtgcacga ttggttatga tttatcagaa acagaggact ttacagccgc gtgtgctact 
1471 tttgcgttag ataatggtaa agttgcagtt ttatcgcatt catggattcc taagcacaaa gttgaatatt 
1541 ctaacgaaaa aataccctat agagaatggg aagaagatgg cttattaaca gtgcaagata agccttatat 
1611 tgactaccaa gatgttttaa attggataat taagatgaat gagcattatg tagtagaaaa aattacttat 
1681 gatagagcga acgcattcaa actaaatcaa gagttaaaaa attacgggtt tgaaacggaa gaaacaagac 
1751 aaggagcttt gaccttgagc cctgcattga aggatttaaa agaaatgttt ttagatggga aaataatatt 
1821 taataataat cctttaatga aatggtatat caataatgtt cagttgaaac tagacagaaa cggaaactgg 
1891 ttgccgtcta agcaaagcag atatcgtaaa atagatggct ttgcagcatt tttaaacaca tatacagata 
1961 ttatgaataa agttgtttct gatagtggtg aaggaaacat agagtttatt agtattaaag acataatgcg 
2031 ttaaggaggt gaatgttatc gcaaaagaga atattgtcac acgcataaag aaaaaattga tagacaattg 
2101 gattgatcag tcaacttcta agctttatga ctttagccca tggaaaaata gatctttttg gggtgtaatt 
2171 aataatacgc ttgaaactaa tgaaacgata ttttcagcta ttacaaagtt atctaattcg atggctagtt 
2241 tgcccttgaa aatgtatgaa gattataaag tagttaatac agaagtatct gatttactta cagtgtcacc 
2311 gaataattct ctgagcagtt ttgattttat taatcaaatt gaaacaatca gaaatgaaaa aggtaatgca 
2381 tatgtgctaa ttgaacgaga catctatcat caaccatcaa agcttttctt attaaatcca gatgttgttg 
2451 aaatgttaat tgaaaaccaa tcacgtgaac tttattattc cattcatgct gcaactggaa ataaattgat 
2521 tgttcataat atggacatgt tgcattttaa acacatcgtg gcatctaata tggtgcaagg cattagtccg 
2591 attgatgtgt tgaagaatac aactgatttt gataatgcag taagaacctt taatcttaca gaaatgcaaa 
2661 aacctgattc tttcatgctt aaatatggtt ccaatgtagg taaagaaaaa aggcagcaag tgttagaaga 
2731 tttcaaacag tactatgaag aaaacggtgg aatattattc caagagcctg gtgttgaaat cgaaccgtta 

2801 cctaaaaaat atgtctctga agatatagtg gcaagcgaga atttaacaag agaaagagta gctaacgttt
2871 ttcaattgcc ctcagtattc ttaaatgcaa gatcaaatac aaatttcgcg aaaaatgaag agttaaacag 
2941 attttacttg cagcatacct tattgccaat cgtcaaacag tatgaagaag aatttaatcg gaaactactt 
3011 actaaaacag acagagaaaa aaataggtat tttaaattta acgttaaatc ttatttaagg gctgatagtg 
3081 caacacaagc agaagtgtac tttaaagcag ttcgtagtgg ttactacact ataaatgaca ttagagagtg 
3151 ggaagattta ccaccagttg aaggtggaga taagccgcta ataagcggtg atttataccc aattgacacg 
3221 ccacttgaat taagaaaatc tttgaaaggt ggtgataaaa atgtcaatga aagctaagta ttttcaaatg 
3291 aaaagaaaat caaaaagtaa aggtgaaata tttatttatg gtgatattgt aagtgataaa tggtttgaaa 
3361 gtgatgtaac tgctacagat ttcaaaaata aactagatga actaggagac atcagtgaaa tagatgttca 
3431 tataaattca tctggaggca gtgtatttga agggcatgca atatacaata tgctaaaaat gcatcctgca 
3501 aaaattaata tctatgtcga tgccttagcg gcatcaattg ctagtgttat cgctatgagt ggtgacacta 
3571 tttttatgca caaaaatagt tttttaatga ttcataattc atgggttatg actgtaggta atgcagaaga 

3641 gttaagaaag acagcggatt tacttgaaaa aacagatgct gttagtaatt cagcttattt agataaagca
3711 aaagatttag atcaagaaca cttaaaacag atgttagatg cagaaacttg gcttactgca gaagaagcct 
3781 tgtctttcgg cttgatagat gaaattttag gagctaatga aataactgct agtatctcta aagagcaata 
3851 taagcgtttc gagaacgtcc cagaagattt aaagaaagat gtagacaaaa tcactaaaat cgatgatgta 
3921 gatacgtttg aattggttga aacacctaaa gaaagtatgt cactagaaga aaaagaaaaa agagaaaaaa 
3991 ttaaacgcga atgcgaaatt ttaaaaatga caatgagtta ttaggaggaa atgaaatgcc gacattatat 
4061 gaattaaaac aatccttagg tatgattgga caacaattaa aaaataaaaa tgatgaattg agtcagaaag 
4131 caacagaccc aaatattgat atggaagaca tcaaacaact agaaacagaa aaagcaggct tacaacaaag 

4201 atttaacatt gttgaaagac aagtaaaaga cattgaagaa aaagaaaaag cgaaagttaa agacacagga
4271 gaagcttatc aatctttaaa tgatcatgag aagatggtta aagctaaggc agagttttat cgtcacgcga 
4341 ttttaccaaa tgaatttgaa aaaccttcaa tggaggcaca acgtttatta cacgctttac caacaggtaa 
4411 tgattcaggt ggtgataagc tcttaccaaa aacactttct aaagaaattg tttcagaacc atttgctaaa 
4481 aaccaattac gtgaaaaagc tcgtctaact aacattaaag gtttagagat tccaagagtt tcatatactt 
4551 tagacgatga tgacttcatt acagatgtag aaacagcaaa agaattaaaa ttaaaaggtg atacagttaa 
4621 attcactact aataaattca aagtatttgc tgcaatttca gatactgtaa ttcatggatc agatgtagat 
4691 ttagtaaact gggttgaaaa cgcactacaa tcaggtctag cagctaaaga acgtaaagat gccttagcag 
4761 taagtcctaa atctggatta gatcacatgt cactttacaa tggatctgtt aaagaagttg agggagcaga

4831 catgtatgat gctattatta acgctttagc agatttacat gaagattacc gtgataacgc aacaatttat 

4901 atgcgatatg cggattatgt caaaattatt agtgttcttt caaatggaac aacaaatttc tttgacacac 

4971 cagcagaaaa agtatttggc aaaccagtag tatttacaga tgcagcagtt aaacctattg tgggagattt 

5041 caattatttt ggaattaact atgatggaac aacttatgac actgataaag atgttaaaaa aggcgaatat 

5111 ttgtttgtat taactgcatg gtatgatcag caacgtacat tagacagtgc attcagaatt gcaaaagcaa 

5181 aagaaaatac aggttcatta cccagctaag ccccaaaagg ttaatgtaac agctaaggct aaatcagctg

5251 taatatcagc cgaatagggg tgatgaaatg agtttagaag aaattaaatt gtggttgaga attgactata 

5321 atttcgaaaa tgatttaatt gaaggtctca ttcaatcggc taagtctgaa ttactattaa gtggggttcc

5391 agattatgac aaagatgact tggaataccc gcttttttgt acagcgatta gatatatcat tgcaagagat

5461 tatgaaagtc gtgggtactc aaatgaccaa tctagaagca aggtttttaa tgaaaaggga ttgcaaaaaa
5531 tgattctgaa attaaaaaag tggtaggtga tttttaaatg gaatttaatg aatttaaaga tcgcgcatat 
5601 ttttttcaat atgtaaataa agggccgtat ccagatgaag aggaaaaaat gaagttgtat agttgctttt 
5671 gtaaaatata taatccttct atgaaagata gagaaatctt aaaagcgact gaatcaaagt caggactaac
5741 cataattatg aggtcttcta aaattgaata tctaccacaa acaaatcact tagttaaaat tgacagaggc 
5811 ttatattccg ataaattatt caacattaaa gaaataagaa ttgatacacc agatattggc tataatacag 
5881 tggttttatc agaaaaatga gtgtagaaat taaagggata cctgaagtgt tgaagaaatt agaatcggta
5951 tacggtaaac aatcaatgca agctaagagt gatagagctt taaatgaagc atctgaattt tttataaagg 
6021 ctttaaagaa agaattcgag agttttaaag atacgggtgc tagcatagaa gaaatgacta aatctaagcc 
6091 ttatacaaaa gtaggaagtc aagaaagagc tgttttaatt gaatgggtag gccctatgaa tcgcaaaaac 
6161 attattcact tgaatgaaca tggttataca agagatggaa aaaaatatac accaagaggt tttggagtta 
6231 tcgcaaaaac attagctgct aatgaacgga agtatagaga aattataaaa aaggagttgg ccagataaat
6301 gaatatatta aacaccataa aagaaatttt attatctgat gcagagctcc aaacatatat aaattctaga 
6371 atatactatt ataaagtcac tgaaaatgct gaaacttcca aaccttttgt tgttattaca cctatttatg 
6441 atttaccttc agacttcatg tctgataaat atcttagtga agaatactta attcaaatag atgtagaatc 
6511 ttcaaataat cagaaaacaa ttgatataac aaaacgaata agatatctgt tatatcaaca aaatttaatt 
6581 caagcatcta gtcagttaga tgcttatttt gaagaaacta aacgttatgt gatgtcgaga cgttatcaag 
6651 gcataccaaa aaatatatat tataaaaatc agcgcatcga ataggtgtgc tttttaattt ttaaggagga
6721 aataagcaat ggcagaagga caaggttctc ataaagtagg ttttaaaaga ttatacgttg gagtttttaa 
6791 cccagaagca acaaaagtag ttaaacgcat gacatgggaa gatgaaaaag gtggtacagt tgatctaaat 
6861 atcacaggtt tagcaccaga tttagtagat atgtttgcat ctaacaaacg tgtttggatg aaaaaacaag 
6931 gtactaatga agttaagtct gacatgagta tttttaatat tccaagtgaa gatctaaata cagttattgg 
7001 tcgttctaaa gataaaaatg gtacatcttg ggtaggagag aatacaagag caccatacgt aacagttatt 
7071 ggagaatctg aagatggttt aacaggtcaa ccagtgtacg ttgcgctact taaaggtact tttagcttgg 
7141 attcaattga atttaaaaca cgaggagaaa aagcagaagc accagagcca acaaaattaa ctggtgactg 
7211 gatgaacaga aaagttgatg ttgatggtac tccacaaggt attgtatacg ggtatcatga aggtaaagaa 
7281 ggagaagcag aattcttcaa aaaagtattc gttggataca cggacagtga agatcattca gaggattctg 
7351 caagttcgtt acccagctaa cccccaaaat gttgaagtag cagttaattc aaaatctgca acagtttcag 
7421 cagaataggg gctttcaaaa taaatcaaag gagaataatt tatgactaaa actttaaagg tttataaagg 
7491 agacgacgtc gtagcttctg aacaaggtga aggcaaagtg tcagtaactt tatctaattt agaagcggat 
7561 acaacttatc caaaaggtac ttaccaagtg gcatgggaag aaaatggtaa agaatctagt aaagttgatg 
7631 tacctcaatt caaaaccaat ccaattctag tctcaggcgt atcatttaca cccgaaacta aatcaatcac 
7701 ggtaaatgct gatgacaatg ttgaaccaaa cattgcacca agtacagcaa cgaataaaac gttgaaatat 
7771 acaagtgaac atccagagtt tgttactgtt gatgagagaa caggagcaat tcacggtgta gctgagggaa 
7841 cttcagttat cactgctacg tctactgacg gaagtgacaa gtctggacaa attacagtaa cagtaacaaa 
7911 tggataatta tttgagacgc agaatatctg cgtctttttt atttgaataa aaggagctaa tacaatgatt 
7981 aaatttgaaa ttaaagaccg taaaacagga aaaacagaga gctatacaaa agaagatgtg acaatgggcg 
8051 aagcagaaaa atgctatgag tatttagaat tagtaaatca agagaataaa aaagaagtac ctaacgcaac 
8121 aaaaatgaga caaaaagagc gacagttatt agtagattta tttaaagatg aaggattgac tgaagaagat 
8191 gttttgaaca agatgagcac taaaacttat acaaaagcct tgaaagatat atttcgagaa atcaatggtg 
8261 aagatgaaga agattcagaa actgaaccag aagagatggg aaagacagaa gaacaatctc aataaaagat 
8331 attttatcga acattaagaa aatacaacgt ttctgtatgg agcagtatgg gtggacatta actgaagtca 
8401 gaaaacagcc gtatgtaaaa cttttagaaa tacttaatga agagaataaa gaagagactg aagaaaaaca 
8471 aagtgaacaa aaagtcatta caggtacgga tttaagaaaa ctttttggaa gctagaaagg aggttaatat 
8541 gaatgaaaaa gtagaaggca tgaccttgga gctgaaatta gaccatttag gtgtccaaga aggcatgaag 
8611 ggtttaaagc gacaattagg tgttgttaat agtgaaatga aagctaatct gtcatcattt gataagtctg 
8681 aaaaatcaat ggaaaagtat caggcgagaa ttaaggggtt aaatgataag cttaaagttc aaaaaaagat 
8751 gtattctcaa gtagaagatg agcttaaaca agttaacgct aattatcaaa aagctaaatc tagtgtaaaa 
8821 gatgttgaga aagcatattt aaagctagta gaagctaata aaaaagaaaa attagctctt gataaatcta 
8891 aagaagcctt aaaatcttcg aatacagaac ttaaaaaagc tgaaaatcaa tataaacgta caaatcaacg 
8961 taaacaagat gcatatcaaa aacttaaaca gttgagagat gcagaacaaa agcttaagaa tagtaaccaa 
9031 gctactactg cacaactaaa aagagcaagt gacgcagtac agaagcagtc cgctaagcat aaagcacttg 
9101 ttgaacaata taaacaagaa ggcaatcaag ttcaaaaact aaaagtacaa aatgataatc tttcaaaatc 
9171 aaacgaaaaa atagaaaatt cttacgctaa aactaatact aaattaaagc aaacagaaaa agaatttaat 
9241 gatttaaata atactattaa gaatcatagc gctaatgtcg caaaagctga aacagctgtt aacaaagaaa 
9311 aagctgcttt aaataattta gagcgttcaa tagataaagc ttcatccgaa atgaagactt ttaacaaaga 
9381 acaaatgata gctcaaagtc atttcggcaa acttgctagt caagcggatg tcatgtcaaa gaaatttagt 
9451 tctattggag ataaaatgac ttccctagga cgtacgatga cgatgggcgt atctacaccg attactttag 
9521 ggttaggtgc agcattaaaa acaagtgcag acttcgaagg gcaaatgtct cgagttggag cgattgcaca 
9591 agcaagcagt aaagacttaa aaagcatgtc taatcaagcg gttgacttag gcgctaaaac aagtaaaagt 
9661 gctaacgaag ttgctaaagg tatggaagaa ttggcagctt taggctttaa tgccaaacaa acaatggagg 
9731 ctatgccggg tgttatcagt gcagcagaag caagcggtgc agaaatggct acaactgcaa ctgtaatggc 
9801 atcagcaatt aattctttcg gtttaaaagc atctgatgca aaccatgttg ctgatttact tgcgagatca 
9871 gctaatgata gtgctgcaga tattcaatac atgggagatg cattaaaata tgcaggtact ccagcaaaag 
9941 cattaggagt ttcaatagag gacacttctg cagcaattga agttttatct aactcagggt tagaggggtc 
10011 tcaagcaggt actgcattaa gagcttcgtt tattaggcta gctaatccaa gtaaaagtac agctaaggaa 
10081 atgaaaaaat taggtattca tttgtctgat gctaaaggtc aatttgttgg catgggtgaa ttgattagac

10151 agttccaaga caacatgaaa ggcatgacga gagaacaaaa actagcaaca gtggctacaa tagttggcac 

10221 tgaagcagca agtggatttt tagccttgat tgaagcgggt ccagataaaa ttaatagcta tagcaaatca 

10291 ttgaagaact ctaatggtga aagtaaaaaa gcagctgatt tgatgaaaga caacctcaaa ggtgctctgg 

10361 aacaattagg tggcgctttt gaatcgttag caattgaagt tggtaaagat ttaacgccta tgattagagc 

10431 aggtgcggaa ggattaacaa aattagttga tggatttaca catcttcctg gttggtttag aaaggcttcg 

10501 gtaggtttag cgatttttgg tgcatctafct ggccctgctg ttcttgctgg tggcttatta atacgtgcag 

10571 ttggaagcgc ggctaaaggc tatgcatcat taaatagacg cattgctgaa aatacaatac tgtctaatac

10641 caattcaaaa gcaatgaaat ctttaggtct tcaaacctta tttcttggtt ctacaacagg aaaaacgtca 

10711 aaaggcttta aaggattagc cggagctatg ttgtttaatt taaaacctat aaatgttttg aaaaattctg 

10781 caaagctagc aattttaccg ttcaaacttt tgaaaaacgg tttaggatta gccgcaaaat ccttatttgc 

10851 agtaagtgga ggcgcaagat ttgctggtgt agccttaaag tttttaacag gacctatagg tgctacaata 

10921 actgctatta caattgcata taaagttttt aaaaccgcat atgatcgtgt ggaatggttc agaaacggta

10991 ttaacggttt aggagaaact ataaagtttt ttggtggcaa aattattggc ggtgctgtta ggaagctagg 

11061 agagtttaaa aattatcttg gaagtatagg caaaagcttc aaagaaaagt tttcaaagga tatgaaagat 

11131 ggttataaat ctttgagtga cgatgacctt ctgaaagtag gagtcaacaa gtttaaagga tttatgcaaa 

11201 ccatgggcac agcttctaaa aaagcatctg atactgtaaa agtgttgggg aaaggtgttt caaaagaaac 

11271 agaaaaagct ttagaaaaat acgtacacta ttctgaagag aacaacagaa tcatggaaaa agtacgttta 

11341 aactcgggtc aaataacaga agacaaagca aaaaaacttt tgaaaattga agcggattta tctaataacc 

11411 ttatagctga aatagaaaaa agaaataaaa aggaactcga aaaaactcaa gaacttattg ataagtatag 

11481 tgcgttcgat gaacaagaaa agcaaaacat tttaactaga actaaagaaa aaaatgactt gcgaattaaa 

11551 aaagagcaag aactcaatca gaaaatcaaa gaattgaaag aaaaagcttt aagtgatggt cagatttcag

11621 aaaatgaaag aaaagaaatt gaaaagcttg aaaatcaaag acgtgacatc actgttaaag aattgagtaa 

11691 gactgaaaaa gagcaagagc gtattttagt aagaatgcaa agaaacagaa atgcttattc aatagacgaa 

11761 gcgagcaaag caattaaaga agcagaaaaa gcaagaaaag caagaaaaaa agaagtggac aagcaatatg 

11831 aagatgatgt cattgctata aaaaataacg tcaacctttc taagtctgaa aaagataaat tattagctat 

11901 tgctgatcaa agacataagg atgaagtaag aaaggcaaaa tctaaaaaag atgctgtagt agacgttgtt 

11971 aaaaagcaaa ataaagatat tgataaagag atggatttat ccagtggtcg tgtatataaa aatactgaaa

12041 agtggtggaa tggccttaaa agttggtggt ctaacttcag agaagaccaa aagaagaaaa gtgataagta 

12111 cgctaaagaa caagaagaaa cagctcgtag aaacagagaa aatataaaga aatggtttgg aaatgcttgg 

12181 gacggcgtaa aaactaaaac tggcgaagct tttagtaaaa tgggcagaaa tgctaatcat tttggcggcg 

12251 aaatgaaaaa aatgtggagt ggaatcaaag gaattccaag caaattaagt tcaggttgga gctcagccaa

12321 aagttctgta ggatatcaca ctaaggctat agctaatagt actggtaaat ggtttggaaa agcttggcaa 

12391 tctgttaaat cgactacagg aagtatttac aatcaaacta agcaaaagta ttcagatgcc tcagataaag 

12461 cttgggcgca ttcaaaatct atttggaaag ggacatcaaa atggtttagc aatgcatata aaagtgcaaa 

12531 gggctggcta acggatatgg ctaataaatc gcgctcgaaa tgggataata tttctagtac agcatggtcg 

12601 aatgcaaaat ccgtttggaa aggaacatcg aaatggttta gtaactcata caaatcttta aaaggttgga 

12671 ctggagatat gtattcaaga gcccacgatc gttttgatgc aatttcaagt tcggcatggt ctaacgctaa 

12741 atcagtattt aatggtttta gaaaatggct atcaagaaca tatgaatgga ttagagatat tggtaaagac 

12811 atgggaagag ctgcggctga tttaggtaaa aatgttgcta ataaagctat tggcggttta aatagcatga 

12881 ttggcggtat taataaaata tctaaagcca ttactgataa aaatctcatc aagccaatac ctacattgtc 

12951 tactggtact ttagcaggaa agggtgtagc taccgataat tcgggagcat taacgcaacc gacatttgct 

13021 gtattaaatg atagaggttc tggaaacgcc ccaggtggtg gagttcaaga agtaattcac agggctgacg 

13091 gaacattcca tgcaccccaa ggacgagatg tggttgttcc actaggagtt ggagatagtg taataaatgc 

13161 caatgacact ctgaagttac agcggatggg tgttttgcca aaattccatg gtggtacgaa aaagaaagat 

13231 tggctagacc aacttaaagg taatataggt aaaaaagcag gagaatttgg agctacagct aaaaacacag 

13301 cgcataatat caaaaaaggt gcagaagaaa tggttgaagc agcaggcgat aaaatcaaag atggtgcatc 

13371 ttggttaggc gataaaatcg gcgatgtgtg ggattacgta caacatccag ggaaactagt aaataaagta 

13441 atgtcaggtt taaatattaa ttttggaggc ggactaacgc tacagtaaaa attgctaaag gcgcgtactc 

13511 attgctcaaa aagaaattaa tagacaaagt aaaatcgtgg tttgaagatt ttggtggtgg aggcgatgga 

13581 agctatctat ttgaatatcc aatctggcaa agatttggac gctacacagg tggacttaac tttaatgacg 

13651 gtcgtcacta tggtatagac tttggtatgc ctactggaac aaacgtttat gccgttaaag gtggtatagc 

13721 agataaggta tggactgatt acggtggcgg taattctata caaattaaga ccggtgctaa cgaatggaac 

13791 tggtatatgc atttatctaa gcaattagca agacaaggcc aacgtattaa agctggtcaa ctgataggga 

13861 aatcaggtgc tacaggtaat ttcgttagag gagcacactt acatttccaa ttgatgcaag ggtcacatcc

13931 agggaatgat acagctaaag atccagaaaa atggttgaag tcacttaaag gtagtggcgt tcgaagtggt

14001 tcaggtgtta ataaggctgc atctgcttgg gcaggcgata tacgtcgtgc agcaaaacga atgggtgtta
14071 atgttacctc gggtgatgta ggaaatatca ttagcttgat tcaacacgaa tcaggaggaa atgcaggtat
14141 aactcaatct agttcgctta gagacatcaa cgttttacag ggcaatccag caaaaggatt gcttcaatat
14211 atcccacaaa catttagaca ttatgctgtt agaggtcaca acaatatata tagtggttac gatcagttat
14281 tagcgttctt taacaacaga tattggcgct cacagtttaa cccaagaggt ggttggtctc caagtggtcc
14351 aagaagatat gcgaatggtg gtttgattac aaagcatcaa cttgctgaag tgggtgaagg agataaacag
14421 gagatggtta tccctttaac tagacgtaaa cgagcaattc aattaactga acaggttatg cgcatcatcg
14491 gtatggatgg caagccaaat aacatcactg taaataatga tacttctaca gttgaaaaat tgttgaaaca
14561 aattgttatg ttaagtgata aaggaaataa attaacagat gcattgattc aaactgtttc ttctcaggat
14631 aataacttag gttctaatga tgcaattaga ggtttagaaa aaatattgtc aaaacaaagt gggcatagag
14701 caaatgcaaa taattatatg ggaggtttga ctaattaatg caatcttttg taaaaatcat agatggttac
14771 aaggaagaag taacaacaga ttttaatcag cttatatttt tagatgcaag ggctgaaagt ccaaacacca
14841 atgataacag tgtaactatt aacggagtag atggtatttt accgggcgca attagttttg cgcctttttc
14911 attagtatta aggtttggct atgatggtat agatgttata gatttaaatt tatttgagca ttggtttaga
14981 tctgtgttta atcgcagaca tccttattat gttattactt ctcaaatgcc tggtgttaaa tatgcagtga
15051 atacagctaa tgttacatct aatttaaaag atggttcttc aactgaaatt gaagtaagtt taaatgttta 
15121 taaagggtat tctgaatcag ttaattggac cgatagcgag ttcttattcg actctaattg gatgtttgaa 
15191 aatggaattc ctcttgattt cacacctaaa tatactcata catcaaatca atttactatt tggaacggtt 
15261 ctactgatac gataaatcca cgattcaagc acgatttgaa aatattaatt aatttaaatg cgagtggagg 
15331 atttgaactg gttaactata caacaggtga tatttttaag tacaacaaaa gtatagataa aaacactgat 
15401 tttgttttag atggtgtgta tgcatatcga gatataaata gagtgggaat tgatacaaat agaggcatta

15471 taacattagc gccaggtaaa aatgaattta agattaaagg agacatcagt gatattaaaa ctacatttaa 

15541 gtttcctttt atttataggt aggtgattta atggattatc atgatcattt atcagtaatg gattttaatg 

15611 aattgatttg tgaaaattta ctagatgtag attatggttc ttttaaagaa tattatgaac tgaatgaagc

15681 taggtacatc acttttacag tttatagaac tacccataat agttttgttt tcgatttact aatttgtgaa 

15751 aacttcataa tttatcatgg tgaaaaatac acaattaagc agacagcgcc aaaggttgaa ggtgataaag 

15821 tttttattga agttacggca tatcacataa tgtatgaatt tcaaaatcac tcagtggaat caaataagct 

15891 tgatgacgac agtagcgaaa ctggtaaaac gccagaatac tctttagatg agtacttaag atatggattt 

15961 gcaaatcaaa aaacttcggt caaaatgacc tataaaataa ttggaaattt taagcgaaaa gtaccgattg 

16031 acgaattagg taacaaaaac ggcttagaat actgtaaaga agcggtagac ctatttggct gtataattta 

16101 cccaaatgat acggagatat gtttttattc tcctgaaaca ttttatcaaa gaagcgagaa agtgattcga 

16171 tatcaatata atactgatac tgtatctgca actgtcagta cattggaatt aagaacagct ataaaagttt 

16241 ttggaaaaaa gtatacagct gaggaaaaga aaaattataa tcctattaga acaactgaca ttaaatattc 

16311 aaatggtttt ataaaagaag gtacttatcg taccgcaaca attgggtcta aagctactat taactttgat 

16381 tgcaagtatg gtaatgaaac agttagattt acaataaaaa agggctctca aggtggaata tataagttga 

16451 ttttagacgg caagcaaatt aagcaaattt cttgttttgc taagtcggtt cagtctgaaa caatagattt 

16521 aataaaaaac attgataaag gcaagcacgt tttagaaatg atatttttag gagaagaccc caaaaataga 

16591 attgatatat cttcaaataa aaaagctaag ccttgtatgt atgttggaac tgaaaaatca acagtcttaa 

16661 atttaattgc tgacaactca ggtcgcaatc aatacaaagc aattgttgac tacgtcgcag atagtgcaaa 

16731 gcagtttggg attcgatatg ctaatacgca aacaaatgaa gatatcgaaa cacaggataa gctgttagaa 

16801 tttgcaaaaa agcaaataaa tgatactcct aagactgaat tagatgttaa ttatataggt tatgaaaaaa 

16871 tagagccaag agatagcgta ttctttgttc atgaattaat gggatataac actgaattaa aggttgttaa 

16941 acttgatagg tcacatccat ttgtaaacgc aatagatgaa gtgtctttca gcaatgaaat aaaggatatg 

17011 gtacaaattc aacaagcgct taacagacga gctattgcac aagataatag atataactat caagcaaatc 

17081 gtacaaatca tttatacact agtactttga attctccttt cgagacaatg gatataggga gtgtattaat 

17151 ataatggcaa cagaagaagt taaaatcaaa gcgctacttg aaaacgataa acagtacttt ccagctacac 

17221 attggaaagc tataaatggg ataccttacg caggcagtag tgatattgat ggattgcctc aagacggtat 

17291 catttcggta gatgataaaa ataaattaga taatttaaaa ataggcgaag caggaattat tcaaaatagc 

17361 attgtacaga aatccccaaa cggtaaattg tggaaaataa cagttgacga tagtgggaaa cttggtacag 

17431 tgctacttta ttagaaagga aggtgcatta tggaaaattt gtatttaata aaggatttgg gagctttagc 

17501 aggtcgagat tatagagcta aggaaataca aaacttacaa agaatagagc aatttgcgct tggcttgaca 

17571 acagagttta agttgcatca gaaagctaaa acaattcaac acctcgctga gcaaatttat tataatggta 

17641 gatcgcaagc agcagtaaac aaatctttac aaagtcaaat taacgcactt gttgtggcac cacgtaataa 

17711 cagtgctaat gagattgttc aagctcgagt taatgtaaac ggcgaaacct ttgacacatt aaaagaacat 

17781 ttagacgatt gggaaaccca aactcaaatt aataaagagg aaactataag agaattaaat aagaccaaac

17851 aagaaattct tgatatcgag tatcgttttg aacctgataa gcaagaattt ttatttgtga cagaacttgc 

17921 acctcttaca aatgcagtaa tgcaatcctt ctggtttgat aatagaacag gcatagtata catgacacaa 

17991 gctagaaata atggctatat gctaagtcgt ctaagaccta atggtcaatt tatagacagc tcattgattg 

18061 taggtggggg tcatggtaca cataacggtt atagatatat tgatgatgag ttatggattt atagttttat 

18131 cttaaatggt aataatgaga atacattagt tcgtttcaag tatacgccca atgtggaaat tagctatggc 

18201 aagtatggta tgcaagatgt atttacagga cacccagaaa aaccctacat cacccctgtc ataaatgaaa 

18271 aagaaaataa aattctatac agaattgaga gacctagaag tcactgggaa cttgaaaact caatgaatta 

18341 tatagagata agaagtttag acgatgttga taaaaatatt gataaagttt tgcataaaat cagtatccct 

18411 atgagactaa caaacgaaac ccaaccaatg cagggtgtga cttttgatga aaaatacttg tattggtata 

18481 caggagacag taatccaaat aatagaaact atttaacggc tttcgattta gaaacaggag aagaagcgta

18551 tcaggttaat gctgactatg gtggaacact agattcattt cctggcgaat ttgcggaagc agaaggtttg 

18621 caaatatact atgacaaaga tagtggtaaa aaagctttga tgctaggtgt tactgtcggt ggtgatggaa 

18691 atagaacaca tcgtattttc atgattgggc aaagaggtat tttagaaata cttcactcaa gaggcgttcc 

18761 ttttatcatg agtgacacag gtggtagagt taaaccttta ccaatgaggc ctgataaact taagaatctt 

18831 gggatgttaa cagagccagg tctttactat ttatacactg atcatacagt tcaaatcgat gatttcccat 

18901 taccaagaga atggcgtgat gcaggttggt ccttggaagt taagccacca caaactggcg gtgatgtaat

18971 tcagatattg acgcgtaata gttatgcaag gaatatgatg acttttgaaa gggtgctttc tggaagaact 

19041 ggagacattt cggactggaa ttatgtgcct aaaaatagtg gtaaatggga gagagtacct tcattcatca 

19111 caaaaatgtc agatattaac atagtaggca tgtcgtttta tttaactacg gatgatacaa aacgttttac 

19181 agattttcca actgaacgta aaggggtagc tggttggaac ttatatgtag aagcttcaaa cacaggtggc 

19251 tttgtccata ggctagttcg taatagtgtt acagcatctg ctgagatact attgaaaaat tatgatagta 

19321 aaacaagttc agggccatgg actttacacg aagggagaat tataagttaa tgagtaattt agagaaatct 

19391 gtagctataa atttagaaaa cacagcgcat tatgaaaata tttcaaatct agatataact tttagaacag 

19461 gagagagtga ttcttctgtt cttcttttta atatcactaa aaataatcaa ccgttattat tgagtgaaga 

19531 aaatatcaaa gcacgaatag cgattcgagg taaaggagtc atggtagttg ctccactaga aatattagat 

19601 ccatttaaag gtattttaaa atttcaatta cctaatgatg taattaaacg agatggaagt tatcaagctc 

19671 aagtttcggt tgcagaatta ggtaattcag acgtggtagc tgtcgagaga actatcacat ttaacgttga 

19741 aaaaagtttg tttagcatga ttccatctga aacaaaatta cactatattg ttgaatttca ggaattagaa 

19811 aaaactatta tggatcgtgc gaaagcaatg gacgaggcta taaaaaatgg tgaagattat gcgagtctga 

19881 ttgaaaaagc taaagaaaaa ggtctatcag atattcaaat agcaaaatct tcaagtatag atgaattaaa 

19951 gcaacttgct aatagccata tatctgattt ggaaaataaa gcgcaagcat attcaagaac attcgatgag 

20021 caaaagcgat atatggatga gaaacatgaa gccttcaagc agtcagcgaa tagtggtggt ttagtcacaa

20091 gtggttctac ttcaaattgg caaaaagcta agattactaa agatgatggt aagataatgc agattactgg 

20161 atttgatttt aataatccag aacaaagaat aggtgattca acccaattta tttatgtttc gcaagctata 

20231 aattacccaa gaggtgttag tactaacggt actgtcgaat atttagtagt aacttcagat tacaagcgta 

20301 tgacttatcg accgaacggt acaaataaag tgtttgttaa aagaaaagaa gcgggttcat ggtctgagtg 

20371 gtcagaatta gctattaatg attacaatac accttttgaa actgttcaaa gtgcccaatc aaaagctaat 

20441 atggccgaaa gtaacgctaa attatacgca gatgacaagt ttaataaaag gtattcggtt atttttgatg 

20511 gaacagcaaa tggtgtgggc tctacattgt acttaaatga gagtttagac caatttattt tattaatttt

20581 ttatgggact tttccaggtg gtgactttac agagtttggc agtccttttg gaggaggaaa gatttcattg 

20651 aatccctcaa atcttccaga tggtgatgga aatggtggag gtgtttatga gtttggatta actaaatcta 
20721 gtcgtacatc tttaactata tcaaacgatg

20791 cgcaaataga gggacaatta acaaaattat 

20861 atgagataat ttcatacgct atcattggtg 

20931 tttctctcaa gtttttagac ctaaagcctt 

21001 tcagaagaaa aagatgactt gcatcaacag 

21071 tcttacgaaa aatggttgct agtatgcaga 

21141 taagcaaaat gcactaatgg caaaacaact 

21211 actgaaaatg cttaaattaa tttcaccaac 

21281 agtaaagaag atatagcgtg gtatgtagat 

21351 gagaaaagta tccagaaaat ctagagtcat 

21421 tggtgtaatg tttggattta ccaaacgaca 

21491 aagactatgt ttgaaaaatt cgacagaata 

21561 tagatagaaa tttcgaagaa ctaaggcgtg 

21631 aaatattaga gacatcaaga tgtggattct 

21701 ttgttaaaaa ctatttttgg catttaaagg 

21771 ttggtcgtgt ttctggttta gtaagtgtaa 

21841 tttggaaaaa aggagcaaac aaatggatgc 

21911 gtaaatcaat tcttagcgaa caaaggtatt 

21981 tactcactgt tgttgcttta catactacgt

22051 tcaaaagcta aagaaatata aagctgaaaa 

22121 gtaatgacac ctacgaatat gaacgacaca 

22191 aaccaagcag aaaaatggtt tgataattca 

22261 agtgttacga ttacgcaaat atgtttttta 

22331 taatattcca tttgataata aagcaaggat 

22401 ttaccgcaaa agttggacat tgtcgttttc 

22471 ttgagagcgc taatctaaac actttcacat 

22541 cgttgcgcaa cctggttggg gtcccgaaac 

22611 tttattagat taaatttccc agataaagta 

22681 ctgccaaaaa gcaagcagta attaaaccta 

22751 tggagcagta ggaaacggaa caaacgaacg 

22821 tatttaagac atgccggtca tgaagtcgca 

22891 atacagcata cggtgttaat gtaggtaata 

22961 tgacattgtt ctagaaatac atttagacgc 

23031 agtcaattca atgcagatac tattgataaa 

23101 gaggtgtaac acctcgtaac gatttactaa 

23171 atctgaatta ggttttatca ctaataaaaa 

23241 aaattaatag ccggtgcgat tcatggtaag 

23311 ttaaaaacga aaagaatccg ccagtgccag 

23381 agaaactggt tattacacag ttgccaatgt 

234S1 agaattactg gtgtattacc taataacgca 

23521 gatggattac ttatattgct aatagtggac 

23591 taatagaata agcagttttg gtaagtttag 

23661 cttcggtact tgcctattat ttaaaattaa 

23731 aaacaaacgt ttttagtata taaattattt 

23801 tcaactatat cgtggtttta tgtttattat 

23871 acgggttttt ttcgaaataa tagtaaaaaa 

23941 aaatatttaa ttttattaaa agttaaaaag 

24011 atgattttta tggtcaaaaa aagactatta 

24081 cttcgtttca tgaatctaaa gctgataaca 

24151 aacagaagat acaagtagcg ataagtgggg 

24221 aagtataaca aagacgcttt gattttaaaa 

24291 acaaaaacac agatcatata aaagcaatga 

24361 ccccaatgta gatttaataa attatctacc 

24431 ggttataaca taggtggtaa ttttaatagt 

24501 aaacaattag ttataataaa ataaaaagta 

24571 caggagtttt aggttacata ccatataaat 

24641 tatcaatact cctgtattat tgattttttc 

24711 tttaataatg ttgatttaaa aaatttgaat 

24781 ttccatttat ttttgtttta acagtgtttg 

248 51 taatataacc agaaagttta tgaaattgga 
24 921 aacaacggta aaccagtatt tatagttata 

249 91 aaacctataa ttcagctggt agcgatttcg 
25061 tttaccgtca aacgatgaat tgtatattaa 
25131 ttatatttaa tgaatgaata ctaatctttt 
25201 tggcgcccgg cttttcaaaa cttttgttta 
25271 cgccataaaa ttctcaccac cattcaacgt 
25341 gaatcttctt tggttaactt atctccatct 
25411 gtgacgataa atctttaggt aactcataag 
25481 agtttctttt tttactttgc aattagttat 
25551 gtcttttata ttaaagcgcc acacaggcgc 
25621 ctaaacgaag cgactttgat atcatcatac 
25691 tatctacacg cttgataaga cttactccat 
25761 ttctttctta ataaaagcgt atgttccttg 
25831 caaaaatcag catttgatgg cgtttcgtct 
25901 attcttctcc tatgccagca ccagttgcac 
25971 atctatagaa gtgactttat tctgttcttc 



tctatttcga cttaggaagt caaagaggct ctggtgcgaa 
aggagtgaga aaataatgca aatattagtt aacaagcgta 
gctttgaaga aggtattgat attgaaaatt taccagaaaa 
taaatattca aatggggaaa tagtttttaa cgaagattat 
attgacagtg aagaacaaaa cacagtcgct tctgatgaca 
aacaagttgt tcaaagtaca aagttatcga tgcaagttaa 
tgtgacactt aataaaaaat tagaagaggt taaaggagag 
attcgaagat attaaaacat ggtatcaatt gaaagaatat 
atggaagtta tagataaaga ggaatatgca attattacag 
aggttataat cttatggctt tttaatttga ataaagtggg 
cgaacaagat tggcgtttaa cgcgattaga agaaaatgat 
gaagacagtc tgagaacgca agaaaaaatt tatgacaagt 
acaaagaaga agatgaaaaa aataaagaga aaaatgctaa 
aggattaata gggacgattc taagtacatt tgttatagcc 
aggtgattac catgcttaag ggaattttag gatatagctt 
gtaatagtta agagtcagtg cttcggcact ggctttttat 
aaaagtaata acaagataca tcgtattgat cttagcatta 
agcccgattc cagtagacga tgagaatata tcatcaataa 
ataaagacaa tccaacatct caagaaggta aatgggcaaa 
caagtataga aaagcaacag ggcaagcgcc aattaaagaa 
aatgatttag ggtaggtgtt gaccaatgtt gataacaaaa 
ttagggaagc agttcaatcc tgatttgttt tatggatttc 
tgatagcaac aggcgaaagg ttacaaggtt tatacgctta 
tgaaaaatac gggcaaataa ttaaaaacta tgatagcttt 
ccgtcaaagt atggtggcgg agctggacat gttgaaattg 
cgtttggcca aaattggaat ggtaaaggtt ggacaaatgg 
cgttacaaga catgttcatt attacgatga cccaatgtat 
agtgttggag ataaagctaa aagcgttatt aagcaagcaa 
aaaaaattat gcttgtagcc ggtcatggtt ataacgatcc 
cgattttata cgtaaatata taacgccaaa tatcgctaag 
ttatatggtg gctcaagtca atcacaagac atgtatcaag 
aaaaagatta tggcttatat tgggttaaat cacaggggta 
agcaggagaa agcgcaagtg gtgggcatgt tattatctca 
agtatacaag atgttattaa aaataactta ggacaaataa 
atgttaacgt atcagcagaa ataaatataa attatcgctt 
tgatatggat tggattaaga aaaactatga cttgtattct 
cctatcggtg gtgtgatatc tagtgaggtt aaaacaccag 
caggttatac acccgataaa aataatgtac cgtataaaaa 
taaaggtaat aacgtaaggg acggctattc aactaattca 
acaatcaaat atgacggcgc atattgtatc aatggctata 
aacgtcgtta tattgctaca ggagaggtag acaaggcagg 
tgcagtttga taattgtata tgatgaatct taggcaggta. 
taaacagtta atttttacat gaatatatta aattttaaaa 
tgtgttcgta ttgtgtgcta tgattaaaaa gttgttatgg 
caatcaaaat ataaattatt tataatttgt ttggtaatga 
acacatttgt agatatttta aactcggtaa atcttttaat 
gtttaatata aaaatgtaat aaaatttata aagaaaggaa 
gctgcaacat tgtcgttagg aataatcact cctattgcta 
atattgagaa tattggtgaC ggcgctgagg tagtcaaaag 
ggtcacacaa aatattcagt ttgattttgt taaagataaa 
atgcaaggtt ttatcaattc aaagactact tattacaatt 
ggtggccttt ccaatacaat attggtctca aaacaaatga 
taaaaataaa atagattcag taaatgttag tcaaacatta 
ggtccatcaa caggaggtaa tggttcattt aattattcaa 
ggtgataaga tgactcaatt tctaggggcg cttcttctta 
atctaacaat gataggttta gttagtgaaa aaaacaaggt 
tattgaaaca tgtttgatat ggttttatag ttttataatt 
ttaattcagt tgcttacagg tctaaaagca aatattttgt 
tatttaatcc tttaattgtt aaatttatta tctggttaat 
ttgtataagc ttattagaca aaagagacaa gttgtttaat 
aaagactttg aaaacagaat cattgaagag ggtgaactta 
atttactaga agttgagcga caagatttca aagtatctga 
acatacactt gtagacctta aacaacaaat taaattggat 
ttcttagctt tttctgacaa agtgcttttt aatttttcgc 
ttgggttact acgagtagct tcttgttttt tgtttttatc 
ctacacttgt aggcgttttt ttatttagta aagtcataat 
attttttgtg aaataaactc caagtattta cgcgcattat 
tgaatggttg attaccacta gttaaaactt catatactat 
tttcatcata aacctccttt caaacactgc tgaaatagac 
tgttaatcac aatacaacct tgcccactac tttaatatta 
ttcggattta gagataccaa attaatatag tcttcgcata 
ctaatacaac gagtgcaatt gtaccatctt taatagaatc 
ttttaacata ggttccattg aatcaccatt aactaaaata 
tctttaaaaa atacttcttc atgcaatatg tcatqatata 
cacatgcaat atacgatact agcttagact ctttatatcc 
caattgttca tttgcatagt taagtacgtc ttcttggcgg 
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26041 ggaggtgtga gtttgttgta tatggaagtg 

26111 acaaatcatt aatcttcaca ttgaagtact 

26181 ttcattttcc cattttgata tcttgccttt 

26251 gttgctaatt gttccatagt catattttta 

26321 cgaaaccctc cttatataag ataatttcat 

26391 gcaaaagttg ttgacatcga aacttttatg 

26461 agggggttca atgacaacta gtgtagcaga 

26531 ggaactaacc aaaaagaagt tgctaaagca 

26601 gaattaatgg cagagatttt acaacttcag 

26671 tgattttttt taaactttaa gtttcgaaag 

26741 taacgttaac caaagaagag ttgaaagaaa 

26811 accaatcagc tcaggtgcaa ttttcagtaa 

26881 aaactcaatt tcgcaaaaga tttgtcgcta 

26951 agtatcagca tggcttcgaa tcaattcatc 

27021 attaacatta tcaatttttg gagtgacact 

27091 aaaatttata gagatatcaa aaactattat 

27161 atgatttcga atgaaggagg aactacaaat 

27231 ttaattgacg tgtggcatgg aaatcaatgg 

27301 tctcggatag agaaggtaag aaatatctaa 

27371 ctattgcctt acaatcctaa atcttttctg 

27441 aaatgctgaa atagtcacga gcaacgctat 

27511 ttcgttatga atcttatgtc tatctagagc 

27581 tctaaatcca taaatttcac ctccttccac 

27651 ggaacgacaa atgcaagctc aaaacaaaaa 

27721 ccattagata ttcaaattaa tgacggatat 

27791 aagaaatacc atacgtaaat aataacttat 

27861 ttttgagaaa gatattgaaa agctaatttc 

27931 ctcttttaac ttcgttccaa gttttattgt 

28001 gtcatcaata atccaagaaa cgaccctgcc 

28071 tctaatttta aaagtgagta cattactgtt 

28141 tatgtcgagt aagtggttca cctattttct 

28211 gtgattaagt ttcatcctat cacctccata 

28281 aggaataaca aatgaacatt caagaagcaa 

2 8351 agattggaaa gaaagtcatc gaactaagat 

28421 aatagcgatg ggacaaacct tatcagatat 

28491 aagttataaa cccaactaga gaccaggaat 

28561 aaattgtttt taaactcatt ttcaaagtaa 

28631 tactagcata cacgccgttt aggaacccag 

28701 atgtagtttt tgaaaatact ttgtatgtat 

28771 gaacctaact ttacacattc taaataatct 

28841 gtgtatcaaa ttcatcagat atcaagggca 

28911 aaggagcata aacaaatgaa cacaagatca 

28981 ctgatgcttc ttcatcctat ttaacggaaa 

29051 aaaaagtgat tacagctact tagaaataaa 

29121 gcgaataata acaaacttta acatttatct 

29191 caccagaaaa cacatatcga ggcgaagaaa 

29261 tcaattgttt ggagtatgta gaagtacagt 

29331 gtagaaaatt tatacattga ttattcagca 

29401 tgatcagaaa gcataaaaaa tggtattagg 

29471 agcagtgttg tgcttcacgg tcttagcgat 

29541 attgcgggat tcgcaagtat cgcaacattc 

29611 gctacttgtt ggagcaagta acagtgcaag 

29681 tatgacctta caacaaaaaa tactatcaca 

29751 gaagtttttg ggatatctaa aacacatgca 

29821 aattggaaag ttggggtatc tggcgtgttg 

29891 agagatatta gaagaacaat tcgagttatt 

29961 gaagaacgca tcaagttaat gattcgttta 

30031 agaaggtatt tttgaagaat taaaactatt 

30101 gtagattcat caattgtaca agagaaagtt 

30171 aatcagttga agaagttaag gaaacttctg 

30241 gttccttaaa aaagcagata cttctgataa 

30311 aagctatcta ctatcaaaga agagcattat 

30381 gaagctagat cactcaaata gagctcatgc 

30451 ccaccgagta ttaaggcaag tgaaggtatt 

30521 ctcatgagtt aagtgagtta tatttcagtc 

30591 ttttcaaaat tataagcgaa atcaatatta 

30661 aatgtagaag aaaaatataa cgaagctttg 

30731 tggatttagg taaatacgtc cctgaatctt 

30801 tgaaattatt gaccttaaat acggtaaagg 

30871 tatggcttgg gcgcatatga actgcttagt 

30941 aaccacgaat agataacttt tctactgaag 

31011 tgttaaacca ttagccagac ttgcttataa 

31081 tgtaagataa agcattcatg tagaacacgt 

31151 tgttgagtga tgaagagatt gcagaacttt 

31221 agaaaaatat gcactagatc aagcgaaaga 

31291 cgctcgcgaa gaatgataac tgatacaaat 



atgtcgttat cgtctttgta tgtagtattt gattcactat 
cagccaaaat tttggcagtt gataatcgag gttcttcctt 
cgttaatttc attaagtcgg gatatttatt attaagatca 
tttttttctt agcttcttta aaccttcacc aatacccata 
tataaaagtt tcgaaaacga aacgcaagga aaatattatt 
atgtattctt aaatcaagtt gttacaaacg aaacaaaagg 
taaaccatac ttaaaaataa aaagcttgat tgcacttaaa 
atcggaatga gtagaagttt attgagtata aagataaatc 
aagctaaaaa attagcagat catttaaatg ttaaagttga 
tgacaactaa ataaaaataa ggaggacact atggaacaaa 
ttatagcgaa agaagttaga aatgctataa aaggcgagaa 
agtaagaatc aataatgacg atttagaaga aatcaataaa 
ggaagattga ggaagctcaa tcatccgatt ccgctaaaaa 
aaaaagctta tgtacaagat gttcatgacc atattagaaa 
taattcagac ttgagtgaaa gtgaatacaa cctagcagca 
ttatatatct atgaaaagag agtttcagaa ttaactatcg 
gaaactacta agaaggctat tcaataaaaa acacgaaaac 
ttaaaagtga aagaaagcaa attaaaaaaa tataaagtgg 
ttaaataagc gcacttaatt agtgcaagta atcaagtgcg 
cttttttctt cttcttgtaa tcccaataac acagaagagt 
ctttagcgaa tgcaattacg tcatcaccga cttcttgcca 
tctaggtaat agcgagattg taatatcgtg agcaattttc 
tgggagataa ctaaattata taacaaaaca acttaaagga 
agtcatctat tactactatg acgaagaagg taataggcga 
gaactgatgg tccgatctca tttcatcaac aacaccattg 
atgccttggt tgatggttat gaatttaagt tagattgaat 
cccataagat taagagacat actggatgtt ttgttaacga 
ctctaatatt atcgagaaat tcatggccag accaagtgat 
ttcgatgaat ttcagatcgc aacaaataaa tttagcttct 
tcaaaatcat atttatcaaa aataatatta tcgttgaaat 
tattagattc tatttctaag agcaagagtc taacgcaatc 
acaggagtat agcagaaagg atcataaaca tcttaaaagg 
ctaagatagc tacaaaaaat cttgtctcta tgacacggaa 
attaccaaca aatgatagtt ttttacaatg catcatttca 
tggcaacctt cagccgatga cctcatggca aatgattggg 
tattgaagca attttagaaa tgctatcaat gatacttttt 
acaacagtct tgtctgaaat tgttacatga taaatagtgt 
agtttttaag tttatttaaa tcgtatttta catcttcgaa 
atctttagca cttccaaaat tattgcaggt taatttaacc 
ttgtagagta cggacaagat atattgttgg tctttagtaa 
tgttatcacc tccttaggtt gataacaaca ttatacacga 
gaaggattgc gtataggcgt cccacaagtt tctagcaaag 
aggaacgtaa cttaggagcg gaaatattag agcttattaa 
caaagttttc tatgcattag atagagaact tcaatacagg 
aaaggagtga tagagatgcc aaaaatcata ataccaccaa 
aatttgtgaa aaagttatac gcaacaccta cacaaatcca 
atacaactgg ttgaaatatt accgtgaaga taatttaggt 
acgggaacat tgattaatat ttctaaatta gaagagtatt 
aggattatca aatgagcgac acatataaaa gctacctatt 
tgtactcatg ccgtttctat acttcactac agcatggtca 
atattttata aggaatactt ttatgaagaa taaagaaact 
atgagcaatt gtcttaaata attatataag gagttattaa 
ttttgcaaca tatgacaatt tcaattctga tgatgttgtt 
aaatccacac tttcaagact taagaaaaaa ggaaagattg 
ttgaaccgca gttacattta actgttgtag aacgtaagaa 
ggcaagatta aacgaacaaa gtgatgaccc tagagaaata 
gccaaccaat tttaaggagg agttaatcaa tggcaatatt 
aaataagaat ttacgtgtgc taaatactga actatcaact 
aaagaagcac caatgccaaa agatgaaaca gctcaactgg 
ctgatttaac taaagattat gttttatcag taggaaaaga 
gaaagaattt agaaataaac ttaacgaact tggtgcggat 
gaaaaaattg ttgattttat gaatgcgaga ataaatgcat 
aaagcttagt gcaagtggag caaaacaatg gctaaactgt 
gcagataaaa gttcagtttt tgctgaagaa ggtacattcg 
ttaaatatga aggcctaaca cagtttgagt ttaataaagc 
cagtgaagag ttgcgcgaat atgttgaaga gtacgtagct 
agtagagatg acgatgtaat agctttattt gaaacaaaat 
ttggtactgg tgatgtcatt atattttcag gtggtgtact 
cattgaagtt tcagctatag ataatcctca acttagatta 
ttaatgtatg acattcatac agttcgcatg actatcatac 
agttaccaat atcaagatta cttcaatggg gaaccgattt 
cggtgaaggt gagtttaaag caggtagtca ttgtagattc 
gcagaataca tgcaaaatgt gcctcaaaag ccaccacatt 
tatataaact gcctgacatc aaaaaatggg ctgatgaagt 
aaatgataaa aactattctg gttggaagct tgtagaaggt 
gcaacgcttg aaaagttagt tgaagcaggt tataaacctg 
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3X361 aagatattac agaaaccaag ttacttagca ttacgaattt agaaaaatta atcggcaaaa aagcattttc 

31431 taaaattgca gaaggcttta tagaaaagcc acaaggtaaa ttaacacttg ctaccgagtc tgataaacga 

31501 ccagctataa agcaatctgc tgaagatgat tttgacaaac tataaaaatt aaaaaggacg gtatataaac 

31571 atgaaagcaa aagtattaaa taaaactaaa gcgattacag gaaaagtaag agcatcatat gcacatattt 

31641 ttgaacctca cagtatgcaa gaagggcaag aagcaaagta ttcaatcagt ttaatcattc ctaaatcaga 

31711 tacaagtacg ataaaagcca ttgaacaagc tatagaagct gctaaagaag aaggaaaagt tagtaagttt 

31781 ggaggcaaag ttcctgcaaa tctgaaactt ccattacgtg atggagatac tgaaagagaa gatgatgtga 

31851 attatcaaga cgcttatttt attaacgcat caagcaaaca agcacctggt attattgacc aaaacaaaat 

31921 tagattaacg gattctggaa ctattgtaag tggtgactat attagagctt caatcaattt atttccattc 

31991 aacacaaatg gtaataaggg tatcgcagtt ggattgaaca acattcaact tgtagaaaaa ggcgaacctc 

32061 ttggcggtgc aagtgcagca gaagatgatt tcgatgaatt agacactgat gatgaggatt tcttataagt 

32131 caataggtgg ggtttttagc cccactttaa ttttaaagaa attgaggtgt caagaatttg aaatttatga 

32201 atatagatat tgaaacatat agcagtaacg atatttcgaa atgtggtgtc tataaataca cagaagctga 

32271 agatttcgaa atcttaatta tagcttattc aatagatggt ggaccgatta gtgcgattga catgactaaa 

32341 gtagataatg agcctttcca cgctgattat gagacgttta aaattgctct atttgaccct gctgtaaaaa 

32411 agtatgcatt caatgctaat ttcgaaagaa cttgtcttgc taaacatttt aataaacaga tgccacctga 

32481 agaatggatt tgcacaatgg ttaattcaat gcgtattggc ttacctgctt cgcttgataa agttggagaa 

32551 gttttaagac tacaaaacca aaaagataaa gcaggtaaaa atttaattcg ttatttctct ataccttgta 

32621 agccaacaaa agttaatgga ggaagaacaa gaaatttgcc tgaacatgat cttgaaaaat ggcaacaatt 

32691 tatagattac tgtattcgag atgtagaagt agaaatgaca attgctaata aaattaaaga ctttccagta 

32761 actgtaattg aacaagcata ttgggttttt gaccaacata taaacgacag aggtattaag ctttctaaat 

32831 cattgatgtt aggagctaat gtgctcgata agcagagtaa agaagaattg cttaaacaag ctaaacatat

32901 aacaggttta gaaaatccta atagtcctac acagttattg gcttggttaa aggatgaaca aggattagat

32971 atacctaatt tacaaaagaa aacggttcag gattacttaa aagtagcaac aggaaaagct aaaaaaatgc
33041 tagaaattag attgcaaatg tctaaaacca gtgtgaaaaa atacaacaaa atgcatgaca tgatgtgcag 
33111 tgatgaacgg gtaagaggtc tgtttcaatt ctacggtgcc ggtactggaa gatgggcagg tagaggtgta 
33181 caacttcaga atttaacaaa gcattatatt tcagatactg aattagaaat agcaagagat cttattaaag 
33251 aacaacgttt tgacgattta gatttattac tcaatgttca tcctcaagac ttattaagtc aattagttag 
33321 gacgacattt actgctgaag aaggtaatga actagcagta agtgattttt ctgcaataga ggcaagagtc 
33391 atagcatggt atgcaaaaga acaatggcgt ttagatgtgt tcaacacaca cggaaagata tatgaagcat
33461 cggcttctca aatgtttaat gtaccggtag aaagcataac taaaggcgac cctctcagac aaaaaggaaa 
33531 agtgtccgaa ttagctttag gctatcaagg tggcgctgga gctttaaaag caatgggtgc attggaaatg 
33601 ggcattgaag aaaacgagtt acaaggttta gttgatagtc ggcgtaacgc aaatcctaac atagttaatt
33671 tttggaaggc ttgccaagag gctgcaatta atactgtaaa atcccgaaag acgcatcata cacatggact 
33741 tagattttat atgaaaaaag gttttctaat gattgaactg cctagtggaa gagctttagc ttatccaaaa 

33811 gctttagttg gtgaaaatag ttggggtagt caagttgttg aatttatggg gttagatctt aaccgtaaat
33881 ggtcaaagtt aaaaacgtat ggtgggaagt tagtcgagaa tattgttcaa gcaactgcaa gggatttact

33951 tgcgatttct atagcaaggc ttgaagcatt aggttttaaa atagttggcc atgtccatga tgaagtaatt
34021 gtagaaatac ctagaggttc aaatggactt aaggaaatcg aaactatcat gaataagcct gttgattggg 
34091 caaaaggatt gaatttgaat agtgacgggt ttacttctcc gttttatatg aaggattagg agtgtgattg 
34161 catgcaacat caagcttata tcaatgcttc tgttgacatt agaattccta cagaagtcga aagtgttaat 
34231 tacaatcaga ttgataaaga aaaagaaaat ttggcggact atttatttaa taatccaggt gaactattaa 
34301 aatataacgt tataaatatt aaggttttag atttagaggt ggaatgatgg ctagaagaaa agttataaga 

34371 gtgcgtatca aaggaaaact aatgacattg agagaagttt cagaaaaata tcacatatct ccagaacttc
34441 ttagatatag atacaaacat aaaatgcgcg gcgatgaatt attgtgtgga agaaaagact caaaatctaa 
34511 agatgaagtt gaatatatgc agagtcaaat aaaagatgaa gaaaaagaga gagaaaaaat cagaaaaaaa 
34581 gcgattttga acctatacca acgaaatgtg agagcggaat atgaagaaga aagaaagaga agattgagac 
34651 catggcttta tgatggaacg ccacaaaaac attcacgtga tccgtactgg ttcgatgtca cttataacca 
34721 aatgttcaag aaatggagtg aagcataatg agcgtaatca gtaacagaaa agtagatatg aacgaagcgc 
34791 aagacaatgt taagcaacca gcgcactaca catacggcga cattgaaatt atagatttta tcgaacaggt 
34861 tacggcacag tatccacctc aactagcatt cgcaataggt aatgcaataa aatacttgtc tagagcacct
34931 ttaaagaatg gtcatgagga tttagcaaag gcgaagtttt acgtccaaag agcttttgac ttgtgggagt
35001 gatgaccatg acagatagcg catgtaaaga atacttaaac caatttttcg gatctaagag atatctgtat 
35071 caggataacg aacgagtggc acatatccat gtagtgaatg gcacttatta ctttcacggg catatcgtac 
35141 caggctggca aggcgtgaaa aagacatttg atacagcgga agagctcgaa acatatataa agcaacatgg 
35211 tttggaatac gaggaacaga agcaactaac tttattttaa ggagatagaa atgatgaaaa tcaaagttga 
35281 aaaaataatg aaaatagacg aattaattaa gtgggcgcga gaaaatccgg agctatcatt tggcagaaaa 
35351 tattatacaa cagacaaaaa tgatgaaaac tttatttact tcggtgtttt taaaaattgt tttaaaataa 
35421 gcgattttat attagttaat gctactttta gtgtcaaagt tgaagaagaa gtaaccgaag aaactaagtt 
35491 tgataggttg tttgaagtgt acgagattca agaaggagtc tataaatctg catcatatga gaatgctagt
35561 ataaacgaac gtttaaaaaa tgacagaatt tttcttgcta aagcattcta catcttaaac gacgacctaa 
35631 ctatgacgtt aatttggaaa gaaggagagt tgattaaata atggaacacg gttcaaaaga atattacgaa 
35701 aagcaaagtg aatactggtt tgatgaagca agcaagtttt tgaagcaacg tgatgagctt attggagata 
35771 tagctaagtt aagagagtgc aacaaagagc tggagaagaa agcaagcgca tgggataggt attgcaagag 
35841 cgttgaaaaa gatttaataa acgaatttgg caaagatggt gaaagagtta aatttggaat ggaattaaac 
35911 aataaaattt ttatggagga agacgcaaat gaataaccgc gaacaaatcg aacaatcagt tattagtgct 
35981 agcgcgtata acggcaatga cacagaggga ttattaaaag agattgagga cgtgcataag aaagcgcaag
36051 cgtttgatga aatacttgag ggtttaccta atgctatgca agatgcaatc aaagaagata ttggtcttga 
36121 tgaagcagta ggaattatga cgggtcaagt tgtctataaa tatgaggagg agcaggaaaa tgactaacat 
36191 attacaagtg aaactatcat caaaagacgc tagaatgcca gaacgaaatc ataagacgga tgcaggttat 
36261 gacatatttt cagctaaaac tgtcgtactt gagccacaag aaaaggcagt gatcaaaaca gatgtagctg 
36331 taagcattcc agagggctat gtcggtttat taactagccg tagtggtgta agtagtaaaa cgcatttagt 
36401 gattgaaaca ggcaagatag acgcgggata tcatggtaat ttagggatta atatcaagaa tgataatgaa 
36471 acgttagaga gtgaggatat gagtaacttt ggtcggagtc cttctggtat agatggaaaa tacagcctac
36541 tacctgtaac agataaattt ttatgtatga atggtagtta tgtcataaat aaaggcgaca aactagctca
36611 attggttatc gtgcctatat ggacacctga actaaagcaa gtggaggaat tcgagagtgt ttcagaacgt
36681 ggagcaaaag gcttcggaag tagcggagtg taaagacata ttagatcgag tcaaggaggt tttggggaag

36751 tgagtgacat gttagaaata tttttcatag ggtttggtgt ttatctattt tgtcgcatag gtattatttt

36821 tctcaagagt aaaaagacta tacacacaaa cctatatgaa atgttgttga ttgctactat ctttgtgaca

36891 tctacatttg ctgataaaca tcaaaagacg catatcttaa tagcattttt agtaatgttt tttatgagta 

36961 agctcaaaca agttcaaggg agctatgagg aatgacacaa tacctagtca caacatttaa agattcaaca 

37031 ggacgtaagc atacacacat aactaaagct aagagcaatc aaaggtttac agttgttgat gcggagagta 

37101 aagaagaagc gaaagagaag tacgaggcac aagttaaaag aaatgcagtt attaaattag ggcagttgtt 

37171 tgaaaatata agggagtgtg ggaaatgact aaacaaatac taagattatt attcttacta gcgatgtatg 

37241 agctaggcaa gtatgtaact gagcaagtat atattatgat gacggctaat gatgatgcag aggcgccgag 

37311 tgactttgaa aaaatcagag ctgaagtttc atggtaatag ctattatcat ttttgaatta attatattaa 

37381 tgtgtttagc aatagcactg gaggtgttgt aaatatgtgg attgtcattt caattgtttt atctatattt 

37451 ttattgatct tgttaagtag catttctcat aagatgaaaa ccatagaagc attggagtat atgaatgctt

37521 atcttttcaa gcagttagta aaaaataatg gtgttgaagg tatagaagat tatgaaaatg aagttgaacg 

37591 aattagaaaa agatttaaaa gctaaagaga ggcgttggct tctctgttct atttaaaata atgaaaggag

37661 ccgaacatgt tagacaaagt cactcaaata gaaacaacta aatatgatcg tgatgtttca tattcttatg 

37731 ctgctagtcg tttatctaca cattggacta atcacaatat ggcttggtct gactttatgc agaagctagc 

37801 acaaacagtt agaactaaag aagatttaac tgagtacaat aaaatgtcta agtctgaaca agccgatata

37871 aaagatgttg gcggatttgt cggtggttat ttaaaagaag gcaaacgacg tgctggtcaa gtcatgaatc 

37941 gttcaatgtt aacacttgat atcgattatg ctgctcaaga tatgactgac atattatcta tgttttatga 

38011 ttttgcatat tgtttatatt caacacataa gcatagagag ataagtccaa gactgcgttt agtgattcct 

38081 ttaaaacgaa atgtaaatgc agatgagtat gaagctattg ggcgtaaagt cgcagatatc gttggcatgg 

38151 attacttcga tgatacaact tatcaaccac ataggttaat gtactggcct tcaactagta acgatgcgga 

38221 atctttcttt acctatgaag atttaccttt gttagaccca gataaaatat taaatgaata tgttgattgg 

38291 actgacacat tagaatggcc aacgtcttca agggaagaga gtaagactaa aagattagca gataagcaag

38361 gcgacccaga agaaaagccg ggaattgttg gtgcattttg tagagcctat acgatagaag aagctataga

38431 aacttttatt cctgacttat acgaaaaaca ttctactaac cgttatacct atcatgaagg ttcaactgca

38501 ggtggattgg tgttatacga aaataacaag tttgcctatt ctcatcataa tacggatccc gtaagcggta 

38571 tgctcgtgaa cagttttgat ttagtacgca tacacttata tggtgctcaa gatgaagacg ctaaaacaga 

38641 tactccggtt aatcgactac ctagttataa agcaatgcag caaagagcgc aaaatgatga agttgttaaa 

38711 aagcaattaa ttaacgacaa aatgtctgat gcaatgcagg atttcgatga aatagtaaat agcgatgatg 

38781 catggtctga gacgttagaa attacttcga aaggtacttt caaagctagt atcccaaata tagaaattat 

38851 attgcgtaat gatccaaatt taaaaggaaa aatagcattt aatgaattta caaaacaaat tgaatgctta 

38921 gggaaaatgc catggaataa taattttaaa atacgtcaat ggcaagacgg tgatgatagc agtttaagaa 

38991 gttatatcga aaagatttat gacatacacc attcaggcaa aacaaaagat gccattataa gcgtagcaat 

39061 gcaaaatgcc tatcatccag taagagatta tctaaataaa atatcgtggg atggacataa acgtcttgaa 

39131 aagttattta tcaaatactt aggtgttgaa gacactgaag tgaatagaac aactaccaaa aaggcattga

39201 ctgctggaat cgctcgagta atggagccag gatgtaaatt tgactatatg cttacacttt atggtcctca

39271 aggtgtaggt aaatctgctt tgctaaaaaa aataggtggt gcatggtttt ctgacagttt agtttctgtt 

39341 actggtaagg aagcatatga ggcattacaa ggcgtttggt taatggaaat ggcagaactt gcagctacaa 

39411 gaaaagctga agttgaagct attaagcatt tcatatctaa acaagttgac cggtttcgtg ttgcttatgg 

39481 acattatatt gaagattttc caaggcaatg tattttcatt ggtacaacta ataaagttga tttcttaaga 

39551 gatgaaactg gtggaagacg tttttggcca atgactgtaa atccagagag agttgaagtg aactggtcta 

39621 aactaaccaa agaagagatc gaccaaatct gggcagaagc taaatactat tatgaacaag gagaagagtt

39691 gttccttaac cctgaactag aagaagaaat gcgttcaatc caaagtaaac atactgagga atctccatat

39761 acaggtatta ttgatgaata tcttaacacg ccaatcccaa gcaattggga agacttaact atctttgaaa

39831 gaagacgatt ttatcaaggt gatgttgata tgttaccaac aggaaatgta gattacattg aaagagacaa

39901 ggtctgtgcg cttgaagtgt ttgttgaatg ttttggtaaa gataagggag atagtagagg atctatggaa

39971 attagaaaga tttctaacgt cttaagacaa ttagacaatt ggtctgtata tgaaggcaat aaaagtggga

40041 aaattcgatt tggaaaagat tatggtgtac agatagcgta tgtaagagat gaaagtttag aggatttaat
40111 ataagaaata ttgaataaat atacattttt agatgttgta tcaaatgttg catcattttt tgagtgatgc
40181 aacacggtgg tgtaaaaagt aatcgtaggt gttgtatcat ttttggtgat gcaacattga tgcaacaaat
40251 gatacaacac ctctttccct tctcgctgta aggttcaacc ctgtttgttt ccaatgttgc atcaaattca
40321 ctataaagtt taaaaagtag tgttagggag taaaggggta taggggtaac cctctaacag ctatttttaa
40391 aagtttggca agaattgatg caacatcgga acacaaatat aaattttgta tacaaggtga ataaatgaaa
40461 gaatcgacat tagaaaaata tttagtgaaa gagataacaa agttaaatgg attatgttta aaatgggtcg
40531 cacctggaac aagaggtgta ccagatagaa ttattattat gccagaagga aaaacatatt ttgtagaaat
40601 gaagcaagaa aagggaaagt tacatccttt acaaaaatat gtgcatcggc aatttgaaaa cagagatcat
40671 acagtgtatg tgttatggaa taaagaacaa gtaaatactt ttataagaat ggtaggtgga acatttggcg
40741 attgatttca aaccacatag ctatcaaaag tatgcaatag ataaagtgat tgataatgag aaatacggtt
40811 tgtttttaga tatggggcta gggaaaacag tatcaacact tacagcattt agtgaattgc agttgttaga
40881 cactaaaaaa atgttagtca tagcacctaa acaagttgct aaagatacat gggttgatga agttgataag
40951 tggaaccatt taaatcatct gaaagtgtct ttagtcttag gaacacctaa agaaagaaat gatgcattaa
41021 acacagaggc tgatatctat gtaaccaata aagaaaatac taaatggtta tgtgatcaat ataaaaaaga 
41091 atggccattt gacatggttg taattgatga actgtctaca tttaaaagtc ctaagagtca aaggtttaaa 
41161 tctattaaaa agaaattacc actcattaat agatttatag gattaacagg aacacctagt ccaaatagtt 
41231 tacaggattt atgggctcaa gtttatttga tagacagagg cgaaagactt gagtcttcat tcagtcgtta 
41301 tcgagaaagg tactttaaac caacacatca agttagcgaa catgttttta actgggagct aagagacgga 
41371 tctgaagaaa agatatatga acgaatagaa gatatatgtt taagcatgaa agcgaaagat tatctggata 
41441 tgcctgacag agttgatact aaacaaacag tagtcttatc tgaaaaagaa agaaaagtat atgaagaatt 
41511 agaaaaaaac tatattttag aatcggaaga agaaggaaca gttgtagctc agaatggggc atcattaagt 
41581 caaaaactac ttcaactatc taacggtgca gtttatacag atgatgaaga tgtaagactt atacatgata 
41651 agaagttaga taagttagag gaaattatag aggagtctca aggccaacca atattattgt tttataactt 
41721 caaacatgat aaagaaagaa tacttcaaag gtttaaggaa gcaaccacat tagaggattc aaactataaa 
41791 gaacgttgga atagtggaga cattaagctg cttatagcac atccagcaag tgcagggcat ggattaaact
41861 tacaacaagg tgggcacatt attgtttggt ttggacttac atggtcattg gaattatacc aacaagcaaa
41931 tgcaagatta tatagacaag gacaaaatca tacgactatt attcatcaca tcatgaccga taacacaata
42001 gatcaaagag tatataaagc tttacaaaat aaagaactaa cgcaagaaga attgatgaaa gctattaaag
42071 caagaatagc taagcataag taatggaggt ataagatggg aaaggcgtca tatgatatta agccaggaac
42141 atttaaatat attgaatcag aaatatataa tttaaatgag aacaagaaag agataaatag attgagaatg
42211 gagatactta acccaacgaa agaactagac accaacattg tgtatggacc gttacaaaaa ggagagccag
42281 ttagaacaac tgagttaatg gcgacaaggt tattgactaa taagatgtta cgtaacttag aagagatggt
42351 tgaagcagtt gaaagtgagt acttaaagtt acctgaagat cataagaaag taataaggtt aaagtattgg
42421 aataaagata agaagctaaa gatagaacaa ataggggatg cttgtcacat gcatcgcaat acagttacta
42491 caatacgaaa gaactttgtt aaagcgatag cgtatcatgc aggtatcaaa taacattgtg caaagattgt
42561 gcaaaaggcc tacaaatctg tagtaatatg atagtatcgg aaagatgtat aaagttatct gaaagttata
42631 cgacataaat acatgaggca catcgctaag cggtgtgtct tttgttatgc aatcaaagag gtgtaagaga
42701 tgaccaagca taataacatt tataagcatg gtcgtaagtc atatcaatac gattggttct atcattcaaa
42771 agcatggaag aagttaagag agatagcatt agatagagat aattatcttt gtcaaatgtg tttacgcgaa
42841 gatattataa cagatgcaaa gattgtgcat cacattattt atgttgatga agattttaac aaagctttag
42911 acttagataa tctaatgtca gtttgttata gctgtcataa caaaattcat gcaaatgata atgacaaaag
42981 taatcttaag aaaattagag ttctaaaaat ttaaataaaa aaactattta aataaaattt tatgcccccc
43051 tgcccatcgg cttaaaatgt tttttcgccg ggtaccggag aggcc

Table 8



Bacteriophage 3A ORFs list

SID     LAN       FRA          POS           a.a.  RBS sequence               STA  STO
100379  3AORF001  1            8515..13488   1657  acaggtacggatttaagaaaacttt  ttg  taa
100380  3AORF002  2            37667..40114  815   tttaaaataatgaaaggagccgaac  atg  taa
100381  3AORF003  1            32188..34149  653   ttaaagaaattgaggtgtcaagaat  ttg  tag
100382  3AORF004  3            17457..19370  637   gctattttattagaaaggaaggtgc  att  taa
100383  3AORF005  1            334..2034     566   agaaaaaagatagttcaagaagaag  gtg  taa
100384  3AORF006  1            15571..17154  527   cttttatttataggtaggtgattta  atg  taa
100385  3AORF007  2            19337..20836  499   atgatagtaaaacaagttcagggcc  atg  taa
100386  3AORF008  3            22176..23630  484   aatgatttagggtaggtgttgacca  atg  tga
100387  3AORF009  1            40726..42093  455   gtaaatacttttataagaatggtag  gtg  taa
100388  3AORF010  3            13491..14738  415   gaggcggactaacgctacagtaaaa  att  taa
100389  3AORF011  2            2039..3277    412   attaaagacataatgcgttaaggag  gtg  taa
100390  3AORF012  2            4001..5209    402   aaaaaagagaaaaaattaaacgcga  atg  taa
100391  3AORF013  1            30379..31545  388   attttatgaatgcgagaataaatgc  atg  taa
100392  3AORF014  2            14738..15562  274   attatatgggaggtttgactaatta  atg  tag
100393  3AORF015  3            3249..4034    261   cttgaattaagaaaatctttgaaag  gtg  tag
100394  3AORF016  -2           25587..26273  228   aagaagctaagaaaaaaataaaaat  atg  tga
100395  3AORF017  3            6729..7370    213   ttaatttttaaggaggaaataagca  atg  taa
100396  3AORF018  3            24540..25154  204   aataaaataaaaagtaggtgataag  atg  taa
100397  3AORF019  2            31565..32128  187   ctataaaaattaaaaaggacggtat  ata  taa
100398  3AORF020  3            36150..36713  187   gcagtaggaattatgacgggtcaag  ttg  taa
100399  3AORF021  2            24011..24535  174   gtaataaaatttataaagaaaggaa  atg  tga
100400  3AORF022  -2           12423..12938  171   taaagtaccagtagacaatgtaggt  att  tga
100401  3AORF023  1            7462..7917    151   aaaataaatcaaaggagaataattt  atg  taa
100402  3AORF024  1            26731..27174  147   actaaataaaaataaggaggacact  atg  tga
100403  3AORF025  1            42106..42543  145   taagcataagtaatggaggtataag  atg  taa
100404  3AORF026  2            35255..35671  138   aagcaactaactttattttaaggag  ata  taa
100405  3AORF027  2            5888..6298    136   atattggctataatacagtggtttt  atc  taa
100406  3AORF028  -3           27845..28255  136   ccttttaagatgtttatgatccttt  ctg  taa
100407  3AORF029  3            34344..34748  134   ttaaggttttagatttagaggtgga  atg  taa
100408  3AORF030  2            6299..6694    131   tataaaaaaggagttggccagataa  atg  tag
100409  3AORF031  1            20833..21225  130   ttaacaaaattataggagtgagaaa  ata  taa
100410  3AORF032  -2           39984..40361  125   aaatagctgttagagggttacccct  ata  tag
100411  3AORF033  1            7957..8325    122   gaatatctgcgtcttttttatttga  ata  taa
100412  3AORF034  -2           28506..28871  121   gttatcaacctaaggaggtgataac  atg  tag
100413  3AORF035  -2           10671..11036  121   tcctagcttcctaacagcaccgcca  ata  tga
100414  3AORF036  2            30020..30382  120   accaattttaaggaggagttaatca  atg  tga
100415  3AORF037  2            21818..22165  115   aagtgtaagtaatagttaagagtca  gtg  tag
100416  3AORF038  -2           42003..42347  114   gtactcactttcaactgcttcaacc  atc  tga
100417  3AORF039  2            21386..21727  113   tccagaaaatctagagtcataggtt  ata  taa
100418  3AORF040  -3           29654..29995  113   ttgattaactcctccttaaaattgg  ttg  taa
100419  3AORF041  -1           4333..4671    112   tactaaatctacatctgatccatga  att  tga
100420  3AORF042  3            5568..5900    110   taaaaaagtggtaggtgatttttaa  atg  tga
100421  3AORF043  1            25690..26019  109   taccaaattaatatagtcttcgcat  ata  tag
100422  3AORF044  3            29676..30005  109   gtcttaaataattatataaggagtt  att  taa
100423  3AORF045  3            30..353       107   cgctagcaacgcggataaatttttc  atg  taa
100424  3AORF046  3            27894..28214  106   aagatattgaaaagctaatttcccc  ata  tga
100425  3AORF047  -2           11907..12227  106   ttcgccgccaaaatgattagcattt  ctg  tga
100426  3AORF048  -3           40343..40663  106   ccataacacatacactgtatgatct  ctg  taa
100427  3AORF049  [illegible]  6749..7069    106   tgttaaaccatcttcagattctcca  ata  taa
100428  3AORF050  1            42700..43014  104   ttatgcaatcaaagaggtgtaagag  atg  taa
100429  3AORF051  [illegible]  13077..13388  103   ttgtacgtaatcccacacatcgccg  att  tga
100430  3AORF052  [illegible]  3722..4024    100   gcatttcatttcctcctaataactc  att  tga
100431  3AORF053  3            17145..17444  99    tcgagacaatggatatagggagtgt  att  tag
100432  3AORF054  [illegible]  19915..20211  98    ataatttatagcttgcgaaacataa  ata  tga
100433  3AORF055  -1           42436..42729  97    aatcgtattgatatgacttacgacc  atg  tag
100434  3AORF056  3            40455..40745  96    taaattttgtatacaaggtgaataa  atg  tga
100435  3AORF057  -1           38665..38952  95    atcatcaccgtcttgccattgacgt  att  taa
100436  3AORF058  -1           21265..21549  94    gaaatttctatctaacttgtcataa  att  tga
100437  3AORF059  -2           10278..10562  94    tttagccgcgcttccaactgcacgt  att  tag
100438  3AORF060  1            5278..5556    92    atatcagccgaataggggtgatgaa  atg  tag
100439  3AORF061  1            35668..35946  92    tttggaaagaaggagagttgattaa  ata  taa


100440 


3AORF062 


2 


35912. .36187 


91 


gttaaatttggaatggaattaaaca 


ata 


taa 
100441


3AORF063 


3 


36720. .36995


91


CggaagtayCgyaytgtoaoyataL 


att 


taa 


100442 


3AORF064 


-2 


35694. .35969


91


CCy LLaLaCytytl. eiy Low u a. a. u cxa 


eta 


taa 


100443 


3AORF065 


-2


32697. .32972


91


daCCyC LuCClCt tyLadoi. uayyt 


ata 


taa 


100444 


3AORF066 


3 


29157. .29429


90




ota 


taa 


100445 


3AORF067 


-2


26661. .26930


89


aiaCuutu tLaycyyaaL.tyya^y« 


ttg


taa 


100446 


3AORF068 


-2


70 24 . . y o y J 


O J 


hthhaahrir , Ahr , tr , r , r , Af'atA.tt a A 


ata 


taa 


100447 


3AORF069 


-3 


13847. .14110


87


t rii^a tthr*r , hr»r , hnAthf*ataftCTA 


atc


tga


100448 


3AORF070 


1 


34993. .35250


85


LttaCytCCaadyayL Lit.i_yjat.i_i. 


ata 

y v.y 


taa 


100449 


3AORF071 


2 


34745. .35002


85


daatyc LLddydaai»yyayLyaayv. 


ata 


tga


100450 


3AORF072 


-1 


27379 . .27636 


85


CttgCCg tCCCLCCtCCodyLty tt 


ttg


taa 


100451 


3AORF073 


2 


37367. .37615


82


K <-*i»rfr" a a K a r^rt h a fr» f- a t" /"* a t t t t t Of A 
Cyy taaLdyCLdL LdLLdL LLLLya 


att 


taa 


100452 


3AORF074 


-2 


23466 . . 23714 


82 


cgtttgtttttccaaaacccaacac


att 


taa 


100453 


3AORF075 


-3


2471. .2719


82


agcaccgLLLgaaacccccLddcdc 


ttg


taa 

uya 


100454 


3AORF076 


1 


26047 . . 26292 


81


aagtacgttttcttggcggggaggt


gtg 


tag


100455 


3AORF077 


2 


28292 . . 28537 


81 


aacatcttaaaaggaggaataacaa 


atg


tag 


100456 


3AORF078 


-1 


5836 . . 6075 


79 


ttttgtataaggcttagatttagtc


att 


taa


100457 


3AORF079 


-2 


5460 . . 5699 


79 


attcagtcgcttttaaaatttctct 


at-/"* 


taa 


100458 


3AORF080 


-2 


31350 . . 31586 


78 


cctgtaatcactttagttttattta 


ata


taa


100459 


3AORF081 


-3


8252. .8488


78 


aagttttcttaaatccgtacctgta 


a h n 
awy 


tga 


100460 


3AORF082 


-1 


35905 . .36138 


77 


atatttatagacaacttgacccgtc 


ata


taa 


100461 


3AORF083 


-1 


34039 . . 34272 


77 


at age ccacccggac La c caa.au aa 


ata 


tga 


100462 


3AORF084 


-1 


12007 . .12240 


77 


acatttctctcatttcgccgccaaa


atg 


taa 


100463 


3AORF085 


-2 


32367 . .32597 


76 


cttacaaggtatagagaaataacga 


acc 


taa 


100464 


3AORF086


-2 


30618 . . 30848 


76 


atataatctaagttgaggaccatct


ata 


taa 


100465 


3AORF087 


-3 


24746 . . 24973 


75 


ataggttttaagttcaccctcttca


atg 


tga 


100466 


3AORF088 


-3 


12980 . . 13204 


74 


tctttctttttcgtaccaccatgga 


3K- 

acc 


tag


100467 


3AORF089


3 


4290 . .4508 


72 


acaggagaagcttatcaatctttaa


atg


taa


100468 


3AORF090 


3 


28926 . .29141 


71 


ttatacacgaaaggagcataaacaa 


atg


taa 


100469 


3AORF091


-2 


13587. .13802 


71 


cttgtcttgctaattgcttagataa 


atg 


tag


100470 


3AORF092 


2 


26471 . . 26683 


70 


aaacgaaacaaaaggagggggttca 


atg


taa 


100471 


3AORF093 


-1 


2524 . .2736 


70 


tccaccgttttcttcatagtactgt 


ccg 


tga


100472 


3AORF094 


-3 


25334 . .25546 


70 


tggcgctttaatataaaagacgtct 


acc 


tga


100473 


3AORF095


3 


8316 . .8525 


69 


aagagatgggaaagacagaagaaca 


acc 


tag 


100474 


3AORF096 


2 


36992 . . 37198 


68 


aacaagttcaagggagctatgagga 


atg


tga 


100475


3AORF097 


-1 


32593 . .32799 


68 


aaagcttaatacctctgtcgtttat


atg


taa 


100476 


3AORF098 


-1


15346 . . 15552 


68 


aatccattaaatcacctacctataa 


ata 


tag 


100477 


3AORF099


1 


7225 . . 7428 


67 


actggtgactggatgaacagaaaag 


fa 
ccg 


taa 

uay 


100478 


3AORF100 


-2 


22620 . . 22823 


67 


cgacttcatgaccggcatgtcttaa


ata


taa


100479 


3AORF101 


-1 


40060 . . 40260 


66 


aaccttacagcgagaagggaaagag 


gtg


taa 


100480 


3AORF102 


-1 


35035 . . 35235 


66 


ttctatctccttaaaataaagttag


ccg 


taa 


100481 


3AORF103 


-2 


1149 . . 1349 


66 


atttttttggagtgttgggtaatca




taa 


100482 


3AORF104 


1 


27661 . . 27858 


65


aaacaacttaaaggaggaacgacaa


atg


tga 


100483 


3AORF105 


-2 


9420 . . 9617 


65 


gcctaagtcaaccgcttgattagac


atg


tga


100484 


3AORF106 


-2 


23244 . . 23438 


64 


caccagtaattcttgaattagttga 


ata


taa 


100485 


3AORF107 


2 


11966 . . 12157 


63 


tctaaaaaagacgctgtagtagacg


ccg 


taa 


100486 


3AORF108 


-3 


35054 . .35245 


63 


ttttcatcatttctatctccttaaa


ata 


taa 

cay 


100487 


3AORF109


-3 


16010 . . 16201 


63 


gttcttaattccaacgtactgacag 


ccg 


taa


100488 


3AORF110 


-1 


6184 . . 6372 


62 


attttcagtgactccacaacagtac


att


taa


100489 


3AORF111


-2 


16500 . .16688 


62 


gtagt caacaac cyccttyc attga 


tta 
ccy 


taa 


100490 


3AORF112 


-2 


8502 . .8690 


62 


cttaat tec cgcccga cact tttuc 


att 


taa 


100491 


3AORF113 


1 


34162. .34347


61


CdLyddyydt. t ayyay uywyoui-yvM 


ata 


tga 


100492 


3AORF114 


2 


12356 . .12541 


61 


ggaCaCCdCdCLddyy<-l.aLdy^ua 


ata 


taa 


100493 


3AORF115 


-2 


7635. .7820


61 


CgaagC CCCCCCdyCt.di,dn.yi.ytt 


att 


tga 


100494 


3AORF116 


-1 


26434. .26613


59 


LLCayCttCLyddyt Lytoaaattt 


eta 


tga 


100495 


3AORF117 


-3 


17804 . . 17983 


59 


atagecat cacccccagctwyuy tc 


ata 
auy 


tga


100496 


3AORF118 


2 


27899 . . 28075 


58 


attgaaaagctaatttccccataag


att 


taa 


100497 


3AORF119 


-1 


39268 . . 39444 


58 


acgaaaccggtcaacttgtttagat


ata 
acy 


tga


100498 


3AORF120 


-2 


37152 . . 37328 


58 


tagctattaccacgaaaccccagcc 


ccy 


taa 


100499 


3AORF121 


-2 


18900. .19076


58 


aaggcacccucwccuau u.L.c».wv«a.*-u 


att 


taa 


100500 


3AORF122 


-1 


21550 . . 21723 


57 


c aagcaegge aa icacc l c aa 


ata 


taa 


100501 


3AORF123 


-3 


33062. .33235


57 


aaaege Cy l,UC 1 1, taotaoyaui.n. 


ttg


tag 


100502 


3AORF124 


2 


21212. .21382


56


aa a*» fc *a^aaora^fl^"^A3 aQQSOdQS 

addLL dyadydyyt »**3y **j 


ctg 


tag 


100503


3AORF125 


-1 


22051. .22221


56 


a a a fr» <^a <ft<ft af-f-rraa^f" or" t" ^ fCft A 


atg


tga 


100504 


3AORF126 


-2


7821. .7991


56


CyttttLCCLyLiLLo(.yyn.i.(.La 


att 


tga 


100505 


3AORF127 


-3 


34712. .34882


56 


K fc -/-»/-»a k * ^ac^^^'a^^rtr'aAA^ nCtaQ 
LtgCdL LdCttdL t.ytyaoi.yuttty 


ttg


taa 


100506 


3AORF128 


-3 


24056 . . 24226 


56 


etc ccaaaaccaaagcguc ul i_y uu 


ata 


taa 


100507 


3AORF129 


-3 


4940. .5110 


56 


cataccatgcagttaatacaaacaa 


ata 


tga 


100508 


3AORF130 


3 


27171. .27338 


55 


cagaattaactatcgatgatttcga 


atg


taa 


100509 


3AORF131 


-1 


40387. .40554 


55 


ccttctggcataataataattctat


ctg 


taa 


100510 


3AORF132 


-2 


1860. .2027 


55 


gcgataacattcacctccttaacgc 


att 


tga 


100511


3AORF133 


-3 


42317 . .42484 


55 


acaaagttctttcgtattgtagtaa 


ctg 


tag 


100512 


3AORF134 


2 


12671. .12835 


54 


tcatacaaatctttaaaaggttgga 


ctg 


tag 



100513 


3AORF135 


-1 


39484 . .39648 


54 


ataatagtatttagcttctgcccag 


att 


taa 


100514 


3AORF136 


1 


29710. .29871 


53 


accttacaacaaaaaatactatcac 


att 


taa 


100515 


3AORF137 


1 


37186 . .37347 


53 


ggcagttgtttgaaaatataaggga 


gtg


taa 


100516 


3AORF138 


2 


20996. .21157 


53 


aatggggaaatagtttttaacgaag


att 


taa 


100517 


3AORF139


3 


15114. .15275 


53 


tcaactgaaattgaagtaagtttaa 


atg 


taa 


100518 


3AORF140 


3 


29442 . .29603 


53 


aaaatggtattaggaggattatcaa 


atg 


taa 


100519 


3AORF141 


-1 


39883 . .40044 


53 


tacaccataatcttttccaaatcga 


att 


taa 


100520 


3AORF142 


-1 


20416 . .20577 


53 


accacctggaaaagtcccataaaaa


att 


tga 


100521 


3AORF143 


-1 


1942. .2103


53 


acaaagcttagaagttgactgatca 


atc


taa 


100522


3AORF144 


-3 


39380 . .39541 


53 


ttccaccagtttcatctcttaagaa 


atc


taa 


100523


3AORF145


3 


20388 . .20546 


52 


tctgagtggtcagaattagctatta 


atg 


taa 


100524 


3AORF146


-2 


2358 . . 2516 


52 


aacatgtccatattatgaacaatca 


att 


tga 


100525


3AORF147


-3


5606. .5764


52


gtgatttgtttgtggtagatattca


att 


tga 


100526


3AORF148


2


34145. .34300


51


tttacttctccgttttatatgaagg


att 


taa 


100527


3AORF149


-1 


7918 . . 8073 


51 


tattct ct tgafct tactaattctaa 


ata 


taa 


100528


3AORF150


-2


11745. .11900


51


ttcatccttatgtctttgatcagca


ata 


taa 


100529


3AORF151


-3


7097. .7252


51


tttaccttcatgatacccgtataca


ata 


tga 


100530 


3AORF152


1


21652. .21804


50


ctaaaaatattagagacatcaagat


gtg 


taa 


100531


3AORF153


2


5381. .5533


50


tcggctaagtctgaattactattaa


gtg 


tga 


100532


3AORF154


-1


39670. .39822


50 


t toataaaatcatcttctttcaaaQ 


ata 


taa 


100533 






38233 . .38385 


50 


ataggctctacaaaatgcaccaaca


att 


tag 


100534


3AORF156


_ 


33040. .33192


50 


tatctgaaatataatgctttgttaa


att 


tag 


100535


3AORF157


-2


10119. .10271


50 


rftcaataatttactataactatta 


att 


tga




3AORF158


-3


36074. .36226


50 


at-ccatcttataatttcattctOQC 


att 


taa 


100537


3AORF159


j 


18338. .18490


50 


t aaataotttctattatttaaatta 


ctg


taa 




3AORF160


j 


39399. .39548


49 


of ttaat t aataaaaatoacaaaac 


ttg 


taa 


100539 


3AORF161


-j 
- & 


8976. .9125


49


ff at actfttaatttttaaacttaa 


ttg


tga


100540


3AORF162


j 


31199. .31348


49




ctg


taa 


100541 


3AORF163


• j 


u^cq 1 ^ cn ft 


49




atc


tga




3AORF164


3 


25182. .25328


48


ttttttcttagctttttctgataaa


gtg 


tag 


100543


3AORF165


3


28353. .28499


48


aatcttgtctctatgacacggaaag


att 


taa 


100544


3AORF166


-1


8899. .9045


48 


atactacotcacttoctctttttaq 


ttg


taa 


100545


3AORF167


-2


411 . .557 


48 


taatacaaattaacatttaaatcct 


ttg


tga


100546


3AORF168


-3


25973. .26119


48


gctgagtacttcaatgtgaagatta


atg 


tag 


100547


3AORF169


-3


25651. .25797


48


aaaaaaacacctacaaatataaaca


ttg 


tag 




3AORF170


J 


24995. .25141


48 


taaoaaaaaaaattaatattcattc 


att 


tag 




3AORF171




21437. .21580


47 


aaaaataataacataaaaaacqoct 


att 


tag 




3AORF172


2


12414. .12557


47 


ctatttoaccctactataaaaaaQt 


atg 


taa 


100551


3AORF173


_ 2_ 


38005. .38148


47 


ataaattatatcatcaaaataatcc 


atg 


taa 


100552


3AORF174


-1


4121 4266 


47 


atttaaagattgataagcttctcct


gtg 


tga 


100553


3AORF175


-1


1124 1267 
J141 . . j * 0 r 


47 


ttcatttaaaaatacttaactttca 


ttg


tag


100554


3AORF176


1 


580. .723 


47 




ata 


taa 


100555


3AORF177


-2


39819. .39962


47 


ttaaaaatctttctaatttccataa 


atc


tag 


100556 


3AORF178


-2


38466. .38609


47


ttagcgtcttcatcttgagcaccat


ata 


tag 


100557


3AORF179


-2


33927. .34070


47 


1 1 1 1 aepcaat caacaaoc t tattc 


atg


tga 


100558


3AORF180


-2


33555. .33698


47 


cgtctttcgggattttacagtatta 


att 


tga 


100559 


3AORF181


-2 


29538 . . 29681 


47 


atagtattttttgttgtaaggtcat 


att 


tga 


100560


3AORF182


-3


17099. .17242


47


aatatcactactgcctgcataaggt


atc


tag 


100561


3AORF183


2


23750. .23890


46


ttaaaaaaacaaacgtttttagtat


ata 


taa 


100562


3AORF184


-1


31648. .31788


46 


tggaagtttcagatttgcaggaact 


ttg 


tga 


100563


3AORF185


-1


30565 . . 30705 


46 


attttgtttcaaataaagctattac 


atc


tag 


100564


3AORF186




16951. .17091


46 


gagaattcaaagtactagtgtataa


atg


tga 


100565


3AORF187


-1 


7153 . .7293 


46 


tatccaacgaatacttttttgaaga 


att 


taa 


100566


3AORF188


-1


1237 . .1377 


46 


ccagctcttctaaagaaacaatttc


att 


taa 


100567 


3AORF189


-2 


33309 . . 33449 


46 


catttgagaagccgatgcttcatat


atc


tga 


100568


3AORF190


-2 


7197 . . 7337 


46 


gtaacgaacttgcagaatcctctga 


atg 


taa 


100569 


3AORF191 


-3 


41459 . .41599 


46 


tcatctgtataaactgcaccgttag 


ata 


tag 


100570 


3AORF192


3 


4863 . . 5000 


45 


gatgctattattaacgctttagcag 


att 


tag 


100571 


3AORF193 


3 


25965 . . 26102 


45 


tatacgatactagtttagactcttt 


ata 


tga 


100572 


3AORF194 


-1 


37069 . .37206 


45 


ctagtaagaataataatcttagtat 


ttg 


tga 


100573 


3AORF195


-1 


11749 . . 11886 


45 


tttgatcagcaatagctaataattt 


atc


tga 


100574 


3AORF196 


-2 


40764 . .40901 


45 


atctttagcaacttgtttaggtgct 


atg 


tga 


100575


3AORF197


-2 


31989 . .32126 


45 


ggctaaaaaccccacctattgactt 


ata 


tga 


100576 


3AORF198 


-3 


36431. .36568 


45 


tttatttatgacataactaccattc 


ata 


tga 


100577


3AORF199


-3


33515. .33652


45


ttccaaaaattaactatgttaggat


ttg 


tga 


100578


3AORF200


-3 


21233 . .21370 


45 


ataagattataacctatgactctag 


att 


tga 


100579 


3AORF201 


1 


23293. .23427 


44 


aagcctatcggtggtgtgatatcta 


gtg 


taa 


100580 


3AORF202 


-1 


39088. .39222 


44 


atagtcaaatttacatcctggctcc 


att 


taa 


100581


3AORF203 


-1 


16309. .16443 


44 


tttgcttgccgtctaaaatcaactt 


ata 


tga 


100582 


3AORF204 


1 


23845. .23976 


43 


atgtttattatcaatcaaaatataa 


att 


taa 


100583 


3AORF205 


1 


29500. .29631 


43 


gcgttgtgcttcacggtcttagcga 


ttg 


taa 


100584 


3AORF206 


2 


16667. .16798 


43 


gaaaaatcaacagtcttaaatttaa 


ttg 


tag 



100585 


3AORF207 


-1 


35386. .35517


43 


tgcagatttatagactccttcttga 


atc


taa 


100586 


3AORF208 


-1 


30013. .30144 


43 


cagctgagctgtttcatcttttggc


att 


taa 


100587 


3AORF209 


-1 


28366. .28497 


43 


taattcctggtctctagttgggttt 


ata 


tga 


100588 


3AORF210 


-1 


15739. .15870 


43 


catcaagcttatttgattccactga 


gtg 


tag 


100589 


3AORF211 


-1 


7693 . .7824 


43 


taactgaagttccctcagctacacc 


gtg 


tga 


100590 


3AORF212 


-2 


4314 . .4445 


43 


ggttctgaaacaatttctttagaaa 


gtg 


tag 


100591 


3AORF213 


-2 


4011. .4142 


43 


tgtttgatgtcttccatatcaatat 


ttg 


taa 


100592 


3AORF214 


-2


1722 . .1853 


43 


tctgtctagtttcaactgaacatta 


ttg 


taa 


100593 


3AORF215 


-3 


16616. .16747 


43 


tcttcatttgtttgcgtattagcat 


atc


tag 


100594 


3AORF216 


-3 


15833. .15964 


43 


gtcattttgaccgaagttttttgat 


ttg 


taa 


100595 


3AORF217 


3 


6363. .6491 


42 


gatgcagagctccaaacatatataa 


att 


taa 


100596 


3AORF218 


-1 


32146. .32274 


42 


aataagctataattaagatttcgaa 


atc


taa 


100597 


3AORF219 


-1 


29800. .29928 


42 


ctagggtcatcactttgttcgttta 


atc


taa 


100598 


3AORF220 


-1 


18409. .18537 


42 


gcattaacctgatacgcttcttctc 


ctg 


tag 


100599 


3AORF221 


-1 


13234. .13362 


42 


ttttatcgcctaaccaagatgcacc 


atc


tag 


100600


3AORF222 


-1 


12313 . .12441 


42 


cccaagctttatctgaggcatctga 


ata 


tga 


100601 


3AORF223 


-1 


4915. .5043 


42 


tccatcatagttaattccaaaataa 


ttg 


taa 


100602 


3AORF224 


-1 


2125. .2253 


42 


attaactactttataatcttcatac 


att 


taa 


100603 


3AORF225 


-2 


26298. .26426 


42 


tcgtttgtaacaacttgatttaaga 


ata 


taa 


100604 


3AORF226 


-2 


17184 . .17312 


42 


cgcctatttttaaattatctaattt 


att 


tag 


100605 


3AORF227 


-2 


1425. .1553 


42 


atcttcttcccattctctatagggt 


att 


taa 


100606 


3AORF228 


-3 


31055. .31183 


42 


cattttttgatgtcaggcagtttat 


ata 


taa 


100607 


3AORF229 


-3 


22592. .22720 


42 


gttataaccatgaccggctacaagc 


ata 


taa 


100608 


3AORF230 


-1 


27883. .28008 


41 


gaaggcagggtcgtttcttggatta 


ttg 


tag 


100609


3AORF231 


-2 


29988 . .30113 


41 


gcttctttaactttctcttgtacaa 


ttg 


taa 


100610 


3AORF232 


-2 


22485. .22610 


41 


tatctgggaaatttaatctaataaa 


ata 


tag 


100611 


3AORF233 


-2 


9264 . .9389 


41 


aagtttgccgaaatgactttgagct 


atc


tga 


100612 


3AORF234 


-3 


23033. .23158 


41 


acctaattcagataagcgataattt 


ata 


tga 


100613 


3AORF235 


1 


25558 . .25680 


40 


aacactgctgaaatagacgtctttt 


ata 


tag 


100614 


3AORF236


1 


34420. .34542 


40 


acattgagagaagtttcagaaaaat 


atc


taa 


100615 


3AORF237 


3 


38442. .38564 


40 


gaagaagctatagaaacttttattc 


ctg 


taa 


100616 


3AORF238


-1 


33628. .33750 


40 


caatcattagaaaaccttttttcat 


ata 


taa 


100617 


3AORF239


-1


29248. .29370 


40 


tcttctaatttagaaatattaatca 


atg 


tag 


100618 


3AORF240


-2 


18156. .18278 


40 


gtctctcaattctgtatagaatttt 


att 


taa 


100619 


3AORF241 


-2 


8088. .8210 


40 


tttcaaggcttttgtataagtttta 


gtg 


tga 


100620 


3AORF242


-3 


39149. .39271 


40 


ttagcaaagcagatttacctacacc 


ttg 


taa 


100621 


3AORF243 


-3 


23558. .23680 


40 


aaaattaactgtttattaattttaa 


ata 


taa 


100622 


3AORF244 


-3 


1697. .1819 


40 


catttcattaaaggattattattaa 


ata 


tga 


100623 


3AORF245


1 


19015. .19134 


39 


agttatgcaaggaatatgatgactt 


ttg 


tag 


100624 


3AORF246


1 


22504. .22623 


39 


gctaatctaaacactttcacatcgt 


ttg 


taa 


100625 


3AORF247


-1 


40567. .40686 


39 


aaagtatttacttgttctttattcc 


ata 


taa 


100626 


3AORF248


-1 


23956. .24075 


39 


tttagattcatgaaacgaagtagca 


ata 


taa 


100627 


3AORF249


-1


11113. .11232 


39 


cacctttccccaacacttttacagt 


atc


tga 


100628 


3AORF250 


-1 


8719. .8838 


39 


ttttattagcttctactagctttaa 


ata 


taa 


100629 


3AORF251 


-2 


16899. .17018 


39 


aactcgtctgttaagcgcttgttga 


att 


tga 


100630 


3AORF252 


-3 


37025 . .37144 


39 


acaactgccctaatttaataactgc 


att 


tga 


100631


3AORF253 


-3 


29138. .29257 


39 


tctacatactccaaacaattgatgg 


att 


taa 


100632 


3AORF254 


-3 


15476. .15595 


39 


caaatcaattcattaaaatccatta 


ctg 


taa 


100633 


3AORF255


1 


13552. .13668 


38 


ttaatagacaaagtaaaatcgtggt 


ttg 


tag 


100634 


3AORF256 


2 


12545. .12661 


38 


aaaagtgcaaagggctggctaacgg 


ata 


taa 


100635 


3AORF257


2 


41870. .41986 


38 


gggcatggattaaacttacaacaag 


gtg 


tga 


100636 


3AORF258 


3 


10827. .10943 


38 


tcaaacttttgaaaaacggtttagg


att 


taa 


100637 


3AORF259


-1 


34570. .34686 


38 


gtgacatcgaaccagtacggatcac 


gtg 


tga 


100638 


3AORF260 


-1 


32389. .32505 


38 


aagcaggtaagccaatacgcattga 


att 


tag 


100639 


3AORF261 


-1 


23830. .23946 


38 


cctttttaacttttaataaaattaa 


ata 


tga 


100640 


3AORF262 


-1 


8158. .8274 


38 


ccatctcttctggttcagtttctga 


atc


taa 


100641 


3AORF263 


-2


14001. .14117 


38 


ttatacctgcatttcctcctgattc 


gtg 


tga 


100642 


3AORF264 


-2 


294 . .410 


38 


tttgcttgtttttattttcccttga 


gtg 


taa 


100643 


3AORF265 


-3 


42683. .42799 


38 


tgacaaagataattatctctatcta 


atg 


tga 


100644 


3AORF266 


-3 


31979. .32095 


38 


aatcctcatcatcagtgtctaattc 


atc


taa 


100645 


3AORF267 


-3 


26306. .26422 


38 


ttgtaacaacttgatttaagaatac 


atc


tga 


100646 


3AORF268 


-3 


16490. .16606 


38 


tacatacaaggcttagcttttttat 


ttg 


tag 


100647 


3AORF269 


-3 


9872. .9988 


38 


tgagacccctctaaccctgagttag 


ata 


tag 


100648 


3AORF270 


1 


21829. .21942 


37 


atagttaagagtcagtgcttcggca 


ctg 


tag 


100649 


3AORF271 


2 


29468. .29581 


37 


tgagcgacacatataaaagctacct 


att 


taa 


100650 


3AORF272 


3 


2955. .3068 


37 


gagttaaacagattttacttgcagc 


ata 


taa 


100651 


3AORF273


3 


5010. .5123 


37 


tttggcaaaccagtagtatttacag


atg 


taa 


100652 


3AORF274 


3 


19956. .20069 


37 


tcaagtatagatgaattaaagcaac 


ttg 


tga 


100653 


3AORF275 


3 


39882. .39995 


37 


gatatgttaccaacaggaaatgtag 


att 


taa 


100654 


3AORF276


-1 


27211. .27324 


37 


attaagtgcgcttatttaattagat


att 


tga 


100655 


3AORF277 


-1 


13516. .13629 


37 


cgaccgtcattaaagttaagtccac 


ctg 


tga 


100656 


3AORF278 


-1 


11893. .12006 


37 


ttttatatacacgaccactggataa 


atc


taa 
100657


3AORF279 


-2 


17535. .17648 


37 


cttgtaaagatttgtttactgctgc 


ttg 


taa 


100658 


3AORF280


-2 


6474. .6587 


37 


tcaaaataagcatctaactgactag 


atg 


taa 


100659 


3AORF281 


-2 


759. .872 


37 


ttttgataccgtcgcgtcataatgg 


att 


tga 


100660 


3AORF282 


-3 


36608 . .36721 


37 


cccaaaacctccttgactcgatcta 


ata 


tga 


100661 


3AORF283


-3 


14960. .15073 


37 


cctcagtcgaagaaccatcttttaa 


att 


taa 


100662 


3AORF284


1 


18859. .18969 


36 


atgctaacagagccaggtctttact 


att 


taa 


100663 


3AORF285 


2 


8237. .8347 


36 


aaaacttacacaaaagccttgaaag 


ata 


taa 


100664 


3AORF286


3 


5157. .5267 


36 


tatgatcagcaacgtacattagaca 


gtg 


tag 


100665 



3AORF287


3 


38610 . .38720 


36 


tttgatttagtacgcatacacttat 


atg 


taa 


100666 



3AORF288 


-1 


36454 . .36564 


36 


tttatgacataactaccattcatac 


ata 


tga 


100667 



3AORF289


-1 


30217 . . 30327 


36 


aacaactttttcataatgctcttct 


ttg 


taa 


100668


3AORF290


-1 


16678 . . 16788 


36 


gcttttttgcaaattctaacagctt 


atc


tga 


100669


3AORF291 


-2 


14310 . .14420 


36 


gtctagttaaagggataaccatctc 


ctg 


tga 


100670 



3AORF292 


-2 


11457 . . 11567 


36 


ttctttcaattctttgattttctga 


ttg 


tga 


100671 


3AORF293 


-3 


29462 . . 29572 


36 


ttcataaaagtattccttataaaat 


atg 


tag 


100672 


3AORF294


-3


22388. .22498


36 


accattccaattttggccaaacgat 


gtg 


tag 


100673


3AORF295


-3 


18629 . . 18739 


36 


aaaaggaacgcctcttgagtgaagt


att 


tag 


100674


3AORF296


-3


6332. .6442


36


tatcagacatgaagtctgaaggtaa


atc


taa 


100675


3AORF297


1


13984. .14091


35


aaatggttgaagtcacttaaaggta


gtg 


tag 


100676


3AORF298


1


40174. .40281


35


catcaaatgttgcatcattttttga


gtg 


taa 


100677




2 


1481. .1588


35


gccgcgtgtgctacttttgcgttag


ata 


taa 


100678


3AORF300


2


40451. .40558


35 


aatataaattttgtatacaaggtga 


ata 


tag 


100679


3AORF301


3


25479. .25586


35


accactagttaaaacttcatatact


ata 


taa 


100680


3AORF302


3 


32106 . . 32213 


35 


gaagatgatttcgatgaattagaca 


ctg 


tga 


100681


3AORF303


3 


36024 . . 36131 


35 


gacacagagggattattaaaagaga


ttg 


tag 


100682


3AORF304




37762 . . 37869 


35 


accgacaaatccgccaacatctttt 


ata 


tga 




3AORF305 



__ 


24088 . . 24195 


35 


tttatctttaacaaaatcaaactga 


ata 


tga 


100684


3AORF306 


__ — 


19507 . . 19614 


35 


atcattaggtaattgaaattttaaa 


ata 


tga 


100685


3AORF307





16081 . .16188 


35 


atgtactgacagttgcagatacagt 


atc


tag 


100686


3AORF308


__ 


11398 . . 11505 


35 


tttctttagttctagttaaaatgtt


ttg 


taa 


100687


3AORF309


-2


33003. .33110


35 


aaacagacctcttacccgttcatca 


ctg 


taa 


100688


3AORF310


-2


24894. .25001


35


gtaaatcgaaatcgctaccagctga


att 


taa 


100689




-2


22005. .22112


35 


ttcgtaggtgtcattacttctttaa 


ttg 


tag 


100690


3AORF312


-2


21711. .21818


35 


aaaataaaaagccagtgccgaagca 


ctg 


tag 


100691


3AORF313


-2


17901. .18008


35


cattaggtcttagacgacttagcat


ata 


taa 




3AORF314


-2


16710. .16817


35 


taattcagtcttaggagtatcattt 


att 


tag 


100693


3AORF315


-2


15990. .16097


35


acatatctccgtatcatttgggtaa


att 


tag 


100694 


3AORF316


-2


2862 . .2969 


35 


aattcttcttcatactgtttgacga 


ttg 


tag 




3AORF317


-3 


40217 . .40324 


35 


tccctaacactactttttaaacttt 


ata 


tga 


100696


3AORF318


-3


37535 . .37642 


35 


tgttcggctcctttcattattttaa 


ata 


taa 


100697 


3AORF319


-3


34421 . .34528 


35 


ttcttcatcttttatttgactctgc 


ata 


tga 


100698


3AORF320


-3


28262 . .28369 


35 


catttgttggtaatatcttagttcg 


atg 


tga 


100699


3AORF321


1 


23989 . . 24093 


34 


taaaaaggtttaatataaaaatgta 


ata 


tga 


100700


3AORF322 


1 


34660 . . 34764 


34 


aagagaagattgagaccatggcttt 


atg 


taa 


100701


3AORF323


3 


30105 . . 30209 


34 


ctaaatactgaactatcaactgtag 


att 


taa 


100702 


3AORF324


3 


30258 . . 30362 


34 


ggaaaagagttccttaaaaaagcag 


ata 


tga 


100703 


3AORF325


3 


40236 . . 40340 


34 


gttgtatcatttttggtgatgcaac 


att 


tag 


100704 


3AORF326




36964 . .37068 


34 


cgcatcaacaactgtaaacctttga 


ttg 


tga 


100705


3AORF327


-1


35242 . .35346 


34 


atttttgtctgttgtataatatttt 


ctg 


taa 


100706


3AORF328


-1 


21916 . .22020 


34 


ccatttaccttcttgagatgttgga 


ttg 


tga 


100707 


3AORF329


-1 


18820 . . 18924 


34 


ggtggcttaacttccaagaaccaac 


ctg 


taa 


100708




-1


15631 . . 15735 


34 


ttatgaagttttcacaaattagtaa 


atc


tag 


100709 


3AORF331


-2 


37998 . .38102 


34 


ttacgcccaatagcttcatactcat


ctg 


tag 


100710 


3AORF332


-2 


7359 . .7463 


34 


tttataaacctttaaagttttagtc 


ata 


taa 


100711 


3AORF333


-3 


24584 . .24688 


34 


aaaaattataaaactataaaaccat 


atc


taa 


100712 


3AORF334 


-3 


24269 . .24373 


34 


tatttttaggtagataatttattaa 


atc


tga


100713 


3AORF335


-3 


14273 . .14377 


34 


cacttcagcaagttgatgctttgta 


atc


tga 


100714 


3AORF336


2 


7559 . . 7660 


33 


gtaactttatctaatttagaagcgg


ata 


tag 


100715 


3AORF337 


2 


13277. .13378 


33 


aatataggtaaaaaagcaggagaat 


ttg 


tag 


100716 


3AORF338


3 


9501 . . 9602 


33 


taggacgtacgatgacgatgggcgt 


atc


taa 


100717


3AORF339


3 


27348 . .27449 


33 


atatctaattaaataagcgcactta 


att 


tga 


100718 


3AORF340 




37372 . .37473 


33 


ttctatggttttcatcttatgagaa 


atg 


taa 


100719 


3AORF341 


_ x 


33421 . .33522 


33 


aagctaattcggacacttttccttt 


ttg 


taa 


100720 


3AORF342 




29047 . . 29148 


33 


tttggcatctctatcactcctttag 


ata 


taa 


100721 


3AORF343




7549 . . 7650 


33 


atgatacgcctgagactagaattgg 


att 


taa 


100722 


3AORF344


— — - 


7297 . . 7398 


33 


ctgctgaaactgttgcagattttga 


att 


tga 


100723 


3AORF345


-2 


23850. .23951 


33 


ttaaacctttttaacttttaataaa 


att 


taa 


100724 


3AORF346


-2 


20607. .20708 


33 


aaagatgtacgactagatttagtta 


atc


taa 


100725 


3AORF347


-2 


14175. .14276


33 


atctgttgttaaagaacgctaataa 


ctg 


taa 


100726 


3AORF348 


-2 


6984. .7085 


33 


cgtacactggttgacctgttaaacc 


atc


tag 


100727 


3AORF349 


-2 


6882. .6983 


33 


tagaacgaccaataactgtatttag 


atc


taa 


100728 


3AORF350 


-3 


40748. .40849 


33 


aactgcaattcactaaatgctgtaa 


gtg 


tga 
100729


3AORF351 


-3 


38345. .38446


33 


ggttagtagaatgtttttcgtataa 


atc


taa 


100730 


3AORF352 


-3 


38081. .38182 


33 


tagttgaaggccaatacattaacct 


atg 


taa 


100731 


3AORF353 


-3 


35432. .35533 


33 


tagcattctcatatgatgcagattt 


ata 


taa 


100732 


3AORF354


-3 


34952. .35053 


33 


ttatcctgatacagatatctcttag 


atc


taa 
Table 9



Bacteriophage 96, complete genome sequence 



1 catagttata ggcttttcag ctatatacca 

71 gaaaccttga tttaatgggg ttttaatcta 

141 cgttgacctt gctctttttt atgttcatca 

211 aatggcctaa tcttttgcta atatattcaa 

281 tcttaatgaa taaggtgtta tcgtagtatc 

351 ttaacggcat tatgactcaa tttaaacaac

421 taatatgttg tatatccttt tttggtacct 

491 atgtattgta ccctcttttt cgtttagatc 

561 gtgatagcta ggatgaataa aaaaatataa 

631 agtattgttc tatggtgatg aatttagagt 

701 atagctaggg tctttcttta aatagccctc 

771 aatttacgaa ccgtttcatt agtacgacct 

841 tgatgttttt tattaaaaaa tcactcccga 

911 gaattgttgt gaagcgacat gtttcttatt 

981 ttattttcat ctaaattgtt tccatcatcc 

1051 ctttagtttt gaatcctgac tttcttttct 

1121 agatgctgtt gctttattct tcctttttgt 

1191 ggcaaaaaat aataagggta ggcgagctac 

1261 cttttcctac ttcttttcta aaactatcat 

1331 tccagcatgt tggtttttgt ccggattatt 

1401 tcgtaactag gttcgtttgg gtcgcgtggt 

1471 gtacctgttg cttagatgtg ttattggttt 

1541 ttgattattg ttatcgtttt gattactatt 

1611 ttgtctttgt tctctttctt tgtttcggtt 

1681 atgcacctaa cactaacgca ctagctaata 

1751 tgctatttgt tttaataaat ctatgatttc 

1821 tcgtctaaca tctctattaa gacgaaattt 

1891 taggafetaga aaacgaacta ctgaaacgcg 

1961 taacatatct ttaccgctct cagacattgt 

2031 tattttgttt cctgatttct ttcgatttct 

2101 tatcacgttt ttcagaaact gacatacgat 

2171 ttttccggca gtccaagact ctttaactgt 

2241 ccttttctca tatttcttta tatttaaaaa 

2311 agttccaata ccgtatatct tcttatattg 

2381 tactcagaca actcatacaa gttacgtacg 

2451 ctgagataaa gccgtgtcgt cttgcgtaat 

2521 gttgccatac gtcaacttgt ggtgggcaag 

2591 gaaggtctaa taaaaatttc tccttcttga 

2661 tcacttcaac ttcacatttc ataagcaatt 

2731 tttctttcta tctctaaccc attgcataaa 

2801 ttatttgcat gaccggctat agtttcttga 

2871 taaagtaatc tgctaattgt tggacttttg 

2941 tgttgattga cttaccccga ttgcttcaga 

3011 tctaagttct ctgataaaat ttttctagca

3081 gtaatactaa tttaccataa gtaatatcac 

3151 tttaggtgtt gacatattac tttaagtgat 

3221 cagaaaattt taaagagttc tctgtaaagg 

3291 tgataaatta ggcgttacta aacaatctgt 

3361 caattgtatg ctttagccaa attattcaac 

3431 aatatcactt taagtgataa aggaggaaac 

3501 ccagtaagga aaattgaagt ggaaggagaa

3571 atgcacgagc agataacgcc atacgcaatc 

3641 gtcaggtcaa aacagaaata tgatcatcat 

3711 aaacaaagta aaaacgaaaa cattagagaa

3781 taccgacgtt aagaaaaact ggtgcttacc 

3851 tgaagctaca gaagaaacaa aacaagaaat

3921 caaaaactgg atgcgggaga ctacaatttc
3991 gactacatgc gataacaaac caaaaacaac 
4061 gatgactggt gcgagttcaa gaacgaacgt 
4131 aattggttcc cgtcacaagc tactttatac 

4201 aggagaggct gaatatggaa tacatcggat
4271 aaaagatgat ctagagaaaa aagtctactc
4341 cgaggacaaa agcgttatat aaaaattgac 
4411 aatacgaatt ataggaggag ttatcaaatg 
4481 tcacagtctt agcgattgta cttatgccgt 
4551 aagtatcgca acattcatat actacaaaga
4621 caagtaacag tgacaaacat ttatcaaaat
4691 tatggctgaa aatattaaaa ctgaacaaca



agataagatt tatcccgccg tctccataaa aatatgcttg 
gcaagtgtca aatatgtgtc aagaaaataa ttttctgaca 
agtaagtgag agtaggtgtc taaagttata gatatattat 
taggtatacc tttagaaagt aggaaagatg tatgcgtgtg 
atttagtcct atttgactct tagcatggtt aaatgacttt 
ttattatctg tacgttttgg taattttgat aatttagctt 
ccacaagtct gtccgcgtta actgtttttg ttccacgaag 
gataggcaac atattaatta catcgctgta tcttgcacca 
ctcgattcgt ctctagattt aaagtattct atcaattgca 
gttcgtcttt tgattttttt gtaccacgaa tatctatttg 
atatactgca tctctgaagc attgtgataa acaactgttt 
cgaccgaatt cgttcaaaaa cttttgatac tccgaacgtt 
aatattcgtt aaataatttt aatgaacgtt gataccaata 
ttttgaatct aaccaatcat tgtaatattc ttcaaacttt 
aaatctctaa gcagttgttg agcagcgttg gttgcctcag 
ttcctgattt gaaagacgga tgttttacgt cgtactgcca 
aattgtaaat gacgccattt tacttttcct cctcaaaatt 
ccgaaatttt attgttgaac aactattgct tcacttcttg 
atgattgatt agggtgtgtt aacgacattc ctggaccacc 
ttccatttct tcagtggctc ttttagcatt taaatattct 
tgtgcttgtt gtccattatt ggtagctgga agattcttct 
gttgattgtt gttaatgttt gtgttgttct cgttgtttac 
ttcttttttc gcttctgctt tatctttagt ttctttcttt 
ttcttgcttt cctctttctt atcgccgtcg ttgctaccgc 
ataaaactaa taatcttttc atgttttaca ctcctttatt 
attgttttgt tctatgattt tgttttcatt tttaagatgt 
tgatttatca tttcgtaagt aaacatttga cctgtgttgt 
ttgaaaagct atctataaat tgaccaactt tattttttaa 
atttagttcg cgcttattta aagttttttc tataatcttg 
tctacttcaa aagggatatt gttattaaat ttttcgataa 
caaatacttg tttttgacct ttatttaact tccctcgaat 
taacttatca ttaggaactt gattcatctt ttatatgact 
ctctcaacgg ctcaaatgta atcgaatact cgccatagtg 
ttctattgcc tccaatatgt attcttcgct taattgtaga 
ccataattgt aagcttctac aatttcgcgt aacgggactg 
tttcgaactt gcgattgttg aatttcgatt gatctaaaat 
ttcttcatat aatactccta atttgttcct ttcggataag 
taccaaccat cgaatcctcg aggtactctt tgtgtttctt 
cttcgtattt tcccatgcgc caaacccctt tggtgtctta 
attttcgatt tcttcccatt cttcgggagt aaattcatct 
tgaatacttc tttcttctgt aattctcgac ttaggtacat 
atattctagg atatttaagt tctttaagcc agttagagat 
caattctact tgagtaatgt tgttctcttt cataagttgt 
ctcttatatt ccataatttt ctcctttagt attacttaat 
ttttcaatac aaaatattac ttttttgaaa taaatatcac 
agtatagttg taaatgtcaa cgggaggtga tacgaaatgc 
tctggagaac taattcgaat atgacacaac aagatgtcgc 
aataagatgg gaaaaagatg acgcagaatt aaaaggctta 
acagaagttg attatataaa ggctaaaaaa atttaacatt 
tgaaatgcaa gaattacaaa catttaattt tgaagaatta 
cccttctttt taggtaagga tgttgctgaa attttagggt 
atgttgatag tgaagatagg ctgatgcacc aaattagtgc 
caacgaatct ggattataca gtttaatctt tgacgcttct 
accgctagga aattcaaacg ctgggtaact tcggaagttt 
aagtacctag tgacccaatg caagcattga gattaatgtt 
taaaaacgtg aaagatgatg ttattgattt gaaagaaaat 
ttaactagaa caatcaatca aagagtagct catatacaaa 
gtagcgaatt attcagggat attaattcag aagtgaaaaa 
aagacaaaaa catttcgacg atgtaattga aatgattgct 
agaatcaagc aaattgaaat gaaattttaa aacgaaatat 
atgcagacgc aaatgcgttt gtaaaaataa gtggcatttc 
gaacaaagag tttcaaaaag aatgcatgta cagatttggt 
aaagctattc aatttatcgg taccaattta atgattaatg 
agtaaaactt ataaaagcta cctagtagca gtactatgct 
ttctatactt cactacagcg tggtcaattg caggattcgc 
atacttttat gaagaataaa aaaactgcta cttgcgtcaa 
atacaactta attaaatcaa aatatacgga ggtagtcaac 
ttattacact aaagatttct caggatacag aaatgaagaa 
4761 gataactttg tagcaaatca agaattgaca gtaacaatca cattgaacga gtacagaaaa cttattgaaa

4831 taaaggctgt taaagataaa gaagaagata cttacagagg taagtatttt gcggaagaaa gaaaaaacga

4901 aaaattggaa aaagaaaata taaaactaaa aaacaaaatt tatgaattac aaaacgaaga agataacgag

4971 gaggacgaag aagacaagga ggacgagaac gatgtattac aaaattggtg agataaaaaa caaaattata 

5041 agctttaacg ggtttgaatt taaagtgtct gtgatgaaga gacatgacgg tatcagtata caaatcaagg 

5111 atatgaataa tgttccactt aaatcgtttc atgtcataga ttcaagcgaa ctatatattg cgacggatgc 

5181 aatgcgtgac gttataaacg aatggattga aaataacaca gatgaacagg acaaactaat taacttagtc 

5251 atgaaatggt aggaggtatg aaaagtgaat gatttacaag agagagaatt agaaacattc gaacaagacg 

5321 accgattcaa agtaactgat ctagacagtg ctaactgggt ttttaagaaa ctggatgcaa tcacaactaa 

5391 agagaatgaa atcaacgatt tagcaaataa agaaattgaa cgcataaacg aatggaaaga taaagaagta

5461 gaaaaattac agagtggcaa agaatattta caaagccttg taattgaata ttacagaata caaaaagaac 

5531 aagatagcaa attcaagttg aatacacctt acggaaaagt gacagccaga aaaggttcaa aagtcattca 

5601 agttagcaat gagcaagaag tcattaaaca acttgagcaa cgaggttttg acaactatgt aaaagtaact 

5671 aaaaaactta gccaatcaga cattaagaaa gacttcaatg taactgaaaa cggcacattg attgacgcaa

5741 acggcgaagt tttagagggt gctagcattg tggagaaacc aacgtcatac acggtaaagg tgggagaata 

5811 gatgactgaa aaaactaatc aagatgtcga tattttaacg caactaggtg taaaagacat cagcaaacaa 

5881 aatgcaaaca agttttataa atttgcgata tacggcaagt tcggtactgg taaaactacg tttttaacaa

5951 aagataacaa taccttagta ctagatataa atgaggacgg aacaacggta acagaagatg gggcagttgt 

6021 gcagattaag aattataagc attttagtgc agtgattaaa atgctgccta aaattattga acaactaaga 

6091 gaaaacggaa aacaaattga tgttgtagtg attgaaacaa tccaaaagtt acgtgatatc actatggacg 

6161 acatcatgga cggtaaatca aagaaaccga catttaatga ttggggcgag tgtgctacac gcattgtaag 

6231 tatttatcgt tatatttcta aattacaaga acattatcaa tttcatcttg ctataagcgg acacgagggc 

6301 attaacaaag acaaagatga tgagggaagt actatcaatc caacaatcac gatagaggca caagaccaaa 

6371 taaaaaaagc agtcatcagt caatctgacg tgttagcaag aatgacaata gaagaacatg agcaagacgg 

6441 cgaaaaaact tatcaatatg tacttaacgc tgaaccatca aatttattcg agacaaagat aagacactca 

6511 agcaacatca aaattaacaa caaacgtttc attaatccaa gtattaacga tgttgtacaa gcaattagaa 

6581 atggtaatta aaaattaatt aaaaggacgg tataaaaatt atgaaaatca ctggtagaac acaatacatt 

6651 caagaaacta atcaagaggc attcatgaaa ggtggggact ttttaggagc tggagaattt acagtaaaag 

6721 ttgcaaatgt cgagtttaac gacagagaaa acagatactt cacgattgtt tttgaaaaca acgaaggtaa 

6791 acaatacaaa cacaaccaat tcgtcccacc attccaacaa gattatcaag aaaaacaata tatcgagtta 

6861 cttagtagat taggaattaa attgaactta ccagatttaa cttttgacac agatcaatta attaacaaaa 

6931 tcggaactat tgtacttaaa aataaattta acgaggaaca aggcaagtat tttgtaagac tctcatatgt 

7001 aaaagtttgg aataaagacg atgaagtagt taataaacca gaacctaaaa ctgatgagat gaaacaaaaa 

7071 gaacagcaag caaatggtaa acagacacct atgagtcaac aatcaaaccc attcgctaat gctaacggtc 

7141 caatagaaat caatgatgat gatttaccgt tctaggacgt ggtttaaatg caatacatta caagatacca 

7211 gaaagacaat gacggtactt attccgtcgt tgctactggt gttgaacttg aacaaagtca cattgattta 

7281 ctagaaaacg gatatccgct aaaagcagaa gtagaggttc cggacaataa aaaactatct atagaacaac

7351 gcaaaaaaat attcgcaatg tgtagagata tagaacttca ctggggcgaa ccagtagaat caactagaaa
7421 attattacaa acagaattgg aaattatgaa aggttatgaa gaaatcagtc tgcgtgactg ttcaatgaaa 
7491 gttgcgagag agttaataga actgattata tcgtttatgt ttcatcatca aatacctatg agtgtagaaa 
7561 cgagtaagtt gttaagcgaa gataaagcgt tattatattg ggctacaatc aaccgcaact gtgtaatatg
7631 cggaaagcct cacgcagacc tggcacatta tgaagcagtc ggcagaggta tgaacagaaa caagatgaat 
7701 cactacgaca aacatgtgtt agcactgtgt agacaacatc ataatgaaca gcacgcaatt ggtgttaagt 
7771 cgtttgatga taaatatcaa ttgcatgact cgtggataaa agttgatgag aggctcaata aaatgttgaa 
7841 aggagagaaa aatgaataag ttactaatag atgactatcc gatacaagta ttaccgaaat tagctgaatt 
7911 aatagggtta aacgaagcaa tagtattgca acaaattcat tattggctaa acaactcaaa acataaatac 
7981 gatggcaaaa cttggatttt taattcttat ccagaatggc aaaaacaatt tccattttgg agcgagagaa 
8051 ctataaaaag gacatttggg agtttagaaa aacaaaattt attgcatgta ggtaactaca acaaggctgg 
8121 atttgaccgt acaaaatggt attcaatcaa ttatgaaaca ttaaacaaac tagtggcacg accatcggga 
8191 caaaatggcc cgacgatgag gacaaattgg cacgatgcaa gaggacaaaa tgacccgacc aataccatag 
8261 actacacaga gactaacaaa catagagaga cagacgacgt ctcaaagtca tttaagtata ttagtaccaa 
8331 tttagaaatt atacaaaacc ctttaaaagc agaacagtta gaacacgaaa ttaaatcatt taagcaagat 
8401 cagttcgaaa tagtaaaagt cgctaccgat tactgcaaag aaaacaacaa aggtctgaat tacttactaa 
8471 ctgtattaaa gaactggaat aaagaaggcg tttcagataa agaaagtgct gaaaacaaat tgaaacctcg
8541 taactctaaa aaagaaacta ctgatgatgt catagcacaa atggaaaaag aattgagtga tgactaatgc 
8611 cgatgagcaa aacacaagca ttagaaatta ttaaaaaagt taggtacgta tacaacatcg attttgataa 
8681 accaaagtta gaaatgtgga ttgatgtatt aagtcaaaac ggggattatc aaccaactgt aaaagctgta 
8751 gatggatata tcaacagtaa caacccgtac ccgcctaacc taccagcaat catgcgtaag gcacctaaaa 
8821 aagtatctat tgagccggta gacaacgaaa ccgctacaca ccaatggaaa atgcagaatg accccgaata 
8891 tgtcagacaa agaaaaatag cgctagataa cttcatgaat aagttggcag aatttggggg cgataacgaa 
8961 tgaattacgg tcaatttgaa attgaaagca caataatcgc tacgctactt aaacaaccgg acgtactaga 
9031 aaagataaga gttaaagatt acatgtttac gaacgaaaag tttaaaacct ttttcaatta tgtaatggac 
9101 gtcggaaaga tagatcatca agaaatctat ttaaaagcaa ctaaagataa agagttttta gatgcagata 
9171 ctataactaa actttacaac tccgatttca ttggatacgg attctttgaa cgttatcaac aagaattatt 
9241 ggaaagttat caaatcaaca aagcgaaaga attggtaact gagttcaaac aacaacctac gaaccaaaat 
9311 tttaataact tgattgatga actcaaggat ttaaaaacaa ttactaacag aaaagaagac ggaaccaaga 
9381 agtttgttga ggagtttgtc gatgagttat acagcgatag ccctaagaag caaattaaga cgggttataa 
9451 gctcatggat tacaaaatag ggggattgga gccgtcgcaa ttaatcgtca tcgcagcgcg tccctcagtg 
9521 ggtaagacag gtcttgcatt aaacatgatg ctgaacatag cacaaaatgg acacaaaaca tctttcttta 
9591 gtctcgaaac aactggcaca tcagtattga aacgtatgtt atcaacaatt actggtattg agttaacaaa 
9661 gataaaagaa atcaggaact taacgccgga tgacttaaca aagttaacga atgcgatgga taaaatcatg 
9731 aaattaggca tcgatatttc tgataaaagt aatatcacac cgcaagatgt gcgagcgcaa gcaatgaggc 
9801 attcagacag gcaacaagtt attttcatag attatcttca actgatggat actgatgcga aagttgatag
9871 acgtgtagca gtagaaaaga tatcacgtga cttaaagata atcgctaacg agacaggcgc aatcatcgta 
9941 ctactttcac aactgaatcg tggtgtcgag tctagacagg ataaaagacc aatgctatcg gacatgaaag 
10011 aatcaggcgg aatagaagca gatgcgagtt cagcgatgct actttaccgt gatgattatt ataaccgtga
10081 cgaagatgac agtatcactg gcaaatctat tgttgaatgt aacatagcca aaaacaaaga cggcgaaacc

10151 ggaataattg aatttgagta ttacaagaag actcagaggt ttttcacatg aatataatgc aattcaaaag 

10221 cttattgaaa tcgatgtatg aagagacaaa gcaaagcgac ccgattgtag caaatgtata tatcgagact 

10291 ggttgggcgg tcaatagatt gttggacaat aacgagttat cgcctttcga tgattacgac agagttgaaa 

10361 agaaaatcat gaatgaaatc aactggaaga aaacacacat taaggagtgt taaaaaatgc cgaaagaaaa 

10431 atattactta taccgagaag atggcacgga agatattaag gtcatcaagt ataaagacaa cgtaaatgaa 

10501 gtttattcgc tcacaggagc ccatttcagc gacgaaaaga aaattatgac tgatagtgac ctaaaacgat 

10571 ttaaaggcgc tcacgggctt ctatatgagc aagagctagg attgcaagca acgatatttg atatttagag 

10641 gtggcacaat gagtaaatac aatgctaaga aagttgagta caaaggaatt gtatttgata gcaaagtaga 

10711 gtgcgaatat taccaatatt tagaaagtaa tatgaatggc actaactatg atcgtatcga aatacaaccg 

10781 aaatttgaat tacaacctaa attcgggaaa caaagaccga ttacgtatat agccgatttc tctttgtgga 

10851 aggaagggaa actggttgaa gttatagacg ttaaaggtaa ggggactgaa gttgccaaca tcaaagcgaa 

10921 gatattcaga tatcagtata gagatgtgaa tttaacgtgg atatgtaaag cgcctaaata cacaggtcaa 

10991 gaatggatgg tatatgagga cttagtgaaa gtcagacgta aaagaaaaag agaaatgaag tgatctaatg 

11061 caacaacaag catatataaa cgcaacaatt gatataagaa tacctacaga agttgaatat cagcattacg 

11131 atgatgtgga taaagaaaaa gatacgctgg caaagcgctt agatgacaat ccggacgaat tactaaagta 

11201 tgacaacata acaataagac atgcatatat agaggtggaa taaatgaagt tgaacgaagt attcgcaact 

11271 aatttaaggg taatcatggc tagagataac gtaagtgtcc aagatttgca caatgaaact ggcgtatcaa 

11341 gatcaactat tagtggatat aaaaacggaa aagctgagat ggttaactta aatgtattag ataaattggc 

11411 agatgctcta ggtgttaatg taagtgaact atttactaga aatcacaaca cgcacaaatt agaggattgg 

11481 attaaaaaag taaatgtata gaggtggaat aaatgagtat cgtaaagatt aacggtaaac catataaatt

11551 taccgaacat gaaaatgaat tgataaaaaa gaacggttta actccaggaa tggttgcaaa aagagtacga 

11621 ggtggctggg cgttgttaga agccttacat gcaccttatg gtatgcgctt agctgagtat aaagaaattg 

11691 tgttatccaa aatcatggag cgagagagca aagagcgtga aatggttagg caacgacgta aagaggctga 

11761 actacgtaag aagaagccac atttgtttaa tgtgcctcaa aaacattctc gtgatccgta ctggttcgat 

11831 gtcacttata accaaatgtt caagaaatgg agtgaagcat aatgagcata atcagtaaca gaaaagtaga 

11901 tatgaacaaa acgcaagaca atgttaaaca accggcgcat tacacatacg gcaacattga aattatagat 

11971 tttatcgaac aggttacggc acagtatcca cctcaactag cattcgcaat aggtaatgca atcaaatact 

12041 tgtctagagc accgttaaag aatggtcatg aggatttagc aaaggcgaag ttttacgtcc aaagagcttt 

12111 tgacttgtgg gagggttaac gatggcaacg caaaaacaag ttgattacgt aatgtcatta caggaacaat 

12181 tgggattaga agactgtgaa aaatatacag acgaacaagt taaagctatg agtcataaag aagttagcaa 

12251 tgtgattgaa aactataaga caagcatatg ggatgaagag ctatataacg aatgcatgtc gtttggtctg

12321 cctaattgtt aaaaggagtg atgaccatga acgatagcgc acgcaaagaa tacttaaacc aatttttcag 

12391 ctctaagaga tatctgtatc aagacaacga gcgagtggca catatccatg tagtaaatgg cacttattac 

12461 tttcacggac attataaaac gatgtttaaa ggcgtgaaaa agacatttga tactgctgaa gagctcgaaa 

12531 tatatataaa gcaacatgat ttggaatatg aggaacagaa gcaaccaact ttattttaga ggagatggaa 

12601 ataatggcaa agattaaaag aaaaaagaag atgacgctac tcgaactggt ggaatgggca tggaacaatc 

12671 ctgaacaagt tgaaagtaaa gtgtttcaat cagatagaat gggcacgctt ggagaatgta gcgaagtaca 

12741 cttttcaact gatgggcatg ggttttatac aaaagtagta acagataaag atatttttac tgtagaaatc

12811 acagaggaag tcactgaaga tactgagttt gattgtctag tagaactaaa cgatattgaa ggttttgaaa 

12881 tatatgaaaa tgattcaatc agagagttga tagacggtac ttccagagcg ttttatatac taaacgaaga 

12951 taaaactatg acattaattt ggaaagatgg ggagttggta gtatgatgca aacctataaa gtatgtcttt 

13021 gtatcaagtt ctttgcatct aaatgtgatt ataaattaaa gaaacattat ttcgtgaaaa gtacgaatga 

13091 ggaaaaagcc acgaacatgg tattaaaact gattcgtaaa aagctcccgt tcgaaactgc aagcatagaa 

13161 gtcgaaaaag tggaggcaat ataatgatac aaccaacaag agaagaatta attaatttca tgaaaaaaca 

13231 tggagctgaa aatgttgact ctatcactga tgagcaaagt gcaataagac actttagagc tcaatcaaaa 

13301 gtttttaaag acgaacgtga tgagtacaag aagcaacgag atgagcttat cgaggatata gctaagttaa 

13371 gaaaacgtaa cgaagagctg gagaacatgt ggcgcacagt caaaaatgaa ttgcttggaa gatacgaaca 

13441 ttactgtttt aaaattagag aactacaccc tgagagcaaa gcgaacagga taggagctct ctatatagga 

13511 ggtaaaagca ctgcagatat tatactgtcg cgaatggaag aactagacgg aacaaatgag ttctacgaat 

13581 ttttagggca aatggaggca gacacaaatg aataaccgtg aacaaataga acaatcagtg atcagtacta 

13651 gtgcgtataa cggtaatgac acagaggggt tactaaaaga gattgaggac gtgtataaga aagcgcaagc 

13721 gtttgatgaa atacttgagg gaatgacaaa tgctattcaa cattcagtta aagaaggtat tgaacttgat 

13791 gaagcagtag gggttatggc aggtcaagtt gtctataaat atgaggagga gcaggaaaat gagtattagt
13861 gtaggagata aagtatataa ccatgaaaca aacgaaagtc tagagattgt gcaattggtc ggagatatta 
13931 gagatacaca ttataaactg tctgatgatt cagttattag cattatagat tttattacta aaccaattta 
14001 tccaattaag ggggacgagt gagtggaatg gaaacgatta aaaaatgtgg tgccgcaccc agttatcaaa 
14071 aataaaaatt taaagtcggt atacgtaaca aaagataatg tgaaagaggt tcaaaaagaa ttaggtttct 
14141 ttgaaatttt taatgaagaa gtgttattaa ctggattttt atcatttcaa aggataccta cttacattat 
14211 ttggattaat cctaaatctc ataagacgcc tagatattac tttgctaacg agcatgagat tgaaagatat 
14281 tttgaatttt tggaggacga gtaaatgctt gaaatcatcg accaacgtga tgcattgcta gaagaaaagt 

14351 atttaaacga cgactggtgg tacgagctag attattggtt gaataaacgc aagtcagaaa atgaacagat

14421 tgatattgat agagtgctta aatttattga ggaattaaaa cgataggaga caacgaataa atgaataatt
14491 caacagtaga tcaattaaaa gaacttttac aaatacaaaa ggagttcgac gatagaatac cgactagaaa
14561 tttaaatgac acagtagcta gtatgattat tgaatttgcg gagtgggtta acacacttga gttttttaaa 
14631 aattggaaga aacaaccagg taagccatta gatacacaat tagatgagat tgctgattac ttagctttca 
14701 gtttgcaatt aactctgact attgttgatg aagaagattt ggaagagact actgaggtta tggttgattt 
14771 gactgaaaat gaagttactt tacctaaact acattcagtt tattttgttc atgtaatgca tacactaaca 
14841 gaacaatttg taaaaggtat tgataatagt attgtacaag ttttaataat gccttttttg tacgccaata 

14911 cttactatac aatcgaccaa ctcattgacg catacaaaaa gaaaatgaaa aggaaccacg aaagacaaga
14981 tggaacagca gacgcaggaa aaggatacgt gtaaagacat cttagatcga gtcaaggagg ttttggggaa 
15051 gtgacgcaat acttagtcac aacattcaaa gactcaacag gacaaccaca tgaacatttt actgctgcta
15121 gagataatca gacgtttaca gttgttgagg cggagagtaa agaaggagcg aaagagaagt acgagaaaca 
15191 agttaagata aggagagatg gagatgccaa agaaaacggt aacgattgat gtagatgaaa acttattagt 
15261 agcagctagt aatgaaatat cagaactatt atatgaatat gacagtgagt taatgtcagc tgatgaagat 
15331 ggcgataata gagatatcga aaaaaaaaga gacgcattaa aacaagctat acaaattatc gataaattaa 
15401 catgtcgagg aggcagacga tgattaacat acctaaaatg aaattcccga aaaagtacac tgaaataatc

15471 aagaaatata aaaataaaac acctgaagaa aaagctaaga ttgaagatga tttcattaaa gaaattaatg 

15541 ataaagacag tgaattttac agtcctatga tggctaatat gaatgaacat gaattaaggg ctatgttaag

15611 aatgatgcct agtttaattg atactggaga tggcaatgat gattaaaaaa cttaaaaata tggattggtt 

15681 cgatatcttt attgctggaa tactgcgatt attcggcgta atcgcactga tgcttgttgt catatcgcct 

15751 atctatacag tggctagtta ccaaaacaaa gaagtatatc aagggacaat tacagataaa tataacaaga

15821 gacaagataa agaagacaag ttctatattg tgttagacaa caagcaagtc atcgaaaact ctgacttact
15891 actcaaaaag aaatttgata gcgcagacat acaagctagg ttaaaagtag gcgacaaagt agaagttaaa 
15961 acgattggtt atagaataca ctttttaaat ttatatccgg tcttatacga agtaaagaag gtagataaat 
16031 aatgattaaa caaatattaa gactattatt cttactagcg atgtatgagc taggtaagta tgtaactgag 
16101 aaagtatata ttatgacgac ggctaatgat gatgtagagg cgccgagtga cttcgcaaag ttgagcgatc 
16171 agtctgattt gatgagggcg gaggtgtcag agtagatgta tagcaaagag tcaattgtta atatgatagg 
16241 cacacataaa atgaagtgta atgtattagc tgatgtaata ccggaatatg atagcaattc aattgcacag 
16311 tatggcatac aagcaacgtt gccgaaacca caaggggaaa actcaagtaa agttgaagat gttgttgtga 
16381 ggcttgagag agcaaataaa aggtatgctc agatgttaaa agaggttgag tttataaatc aatcgcaaca 
16451 gagattggga cacgttgact tttgcttctt agagttattg aagaaaggtt ataacaggga tgcgattatc

16521 aagaagatgc ctaactctaa attaaataga aacaacttct tagcgcgccg tgatgagtta gcagaaaaga
16591 tttatctact acagtgacga aaatgacaaa aatgacagaa atgacgaaaa tgacactatt tttaaactgt
16661 gaattaattt tatataattg atttgtaaga attatcttaa gacgtggggt aatagccaca ttagatgttc 
16731 tcatcgatgt gattgagaag tgacaaacat ataaaagatg atatgttacg ctattaatca cctactacct 
16801 gcctatatgg tgggtagttt aattcttgca ttttgagtca taactatttt cctcctttca catttattga 
16871 acgtagctcc tgcacaagat gtaggggcat tttttatatc taaataacta gagtaattaa cgtaaaggcg 
16941 tgtgatacag tgaaaacaat tgattaaatt aacaccgaag caagaaaagt ttgtgctagg actcatagag 
17011 ggcaagagcc aacggaaagc atatattgac gcagggtatt cgactaaagg taagagtggg gaatatctag 
17081 ataaagaagc gagtacactt tttaaaaatc ggaaggtttc cggaaggtac gaaaaattgc gtcaagaagt 
17151 agctgaacaa tcaaaatgga cacgccaaaa ggcctttgaa gaatatgagt ggctaaagaa tgtagctaag
17221 aatgacattg aaatagaggg agtgaagaaa gcgacagctg atgcattcct cgctagttta gatggtatga 
17291 atagaatgac gttaggtaac gaagttttag ctaaaaagaa aatagaaact gaaattaaga tgcttgagaa 
17361 gaagattgaa caaatagata aaggtgacag tggaacagaa gataaaatca aacaacttca cgacgcaata 
17431 acggaagtga tcgtcaatga ataaacttaa atctttatat acggacaaac aaattgaaat attgaagcaa 
17501 acgcaaaaac aagattggtt tatgttaatt aatcacggag caaagcgtac aggtaaaaca atattaaaca 
17571 atgacttatt tttacgtgag ttaatgcgtg tgcgaaagat agcagacgaa gaaggaattg agacacctca 
17641 atatatactt gctggtgcaa cattaggtac gattcaaaaa aacgtactaa tagagttaac taacaaatat 
17711 ggcattgagt ttaattttga taaatataat tcattcatgt tatttggcgt tcaagtggtt cagacaggtc 
17781 acagtaaagt aagtggtata ggagctatac gtggtatgac atcgtttggt gcatatatca atgaagcgtc 
17851 gttagcgcat gaagaggtgt ttgacgagat taagtcacgt tgtagtggaa ctggtgcaag aatattggta 
17921 gataccaacc ctgaccatcc cgagcattgg ttgttgaaag attatattga aaatacagat cctaaagcag 
17991 gtatactgag tcaccaattt aagctcgatg acaataactt tcttaatgat agatataaag agtctattaa 
18061 ggcttcaaca ccatcaggta tgttctatga acgtaatatc aacggtatgt gggtgtctgg tgacggtgta 
18131 gtatatgccg actttgattt gaatgagaat acgattaaag cagatgaact ggacgacata cctatcaaag 
18201 aatactttgc tggtgtcgac tggggttacg agcactatgg atctattgtg ttaataggac gaggtataga 
18271 tggtaacttt tattttattg aggagcacgc acaccaattt aagtttattg atgattgggt ggttattgca 
18341 aaagatattg taagtagata tggcaatatt aatttttact gcgatactgc acgacctgaa tacatcactg 
18411 aatttagaag acatagatta cgtgcaatta acgctgataa aagtaaacta tcgggtgtgg aggaagttgc 
18481 taagttgttc aaacaaaaca agttacttgt tctttatgat aatatggata ggtttaagca agaggtattt 
18551 aaatatgttt ggcaccctac aaacggagag cctataaaag aatttgatga cgtgttggac tcgttaagat 
18621 atgccatata cacacatact aaacctgaac gattaaggag ggggaaatga cattgtataa gttaatagat 
18691 gatattgaag cacaaggaat attgcctaag catattgagg ctccaataga gtcacataaa gacgatagag
18761 agagaatggt taatctctat aatagataca agacacatat tgactatgta ccaatattca aacgtcgacc 
18831 aattgaagaa aaagaagatt ttgaaactgg tggaaatgta aggcgattag acgtgtctgt taataacaaa 
18901 cttaacaact cttttgacag cgaaattgtt gatacacgtg ttggttattt acatggtgtt cctgttactt 
18971 atgatttaga tgaaaacgca gaaaaaaacg aaaagttgaa aaagtttata accaactttg ccattagaaa 
19041 tagtgtcgat gatgaggatt ctgaaatagg taaaatggca gcaatttgcg gatatggtgc taggttagca 
19111 tatattgata cgaatggtga tattaggatt aagaatatag acccctataa tgttattttt gttggcgaca 
19181 atattttaga acctacatac tcattgcgct acttttatga aaaagatgat gataatggca ctgattatgt 
19251 gtacgcagag ttttacgata atgcttatta ttatgtattt cgaggagaag gtattgacgc tttgcaagaa
19321 gttggacgat atgaacattt atttgattac aatccattgt ttggtgtacc taacaacaaa gagatgatag 
19391 gagatgctga aaaggttatt cacttaattg acgcatatga tttaacaatg agcgatgcat caagtgagat 
19461 tagtcagaca cgtttagcat accttgtgtt acgcggtatg ggtatgagtg aagaaatgat tcaagaaaca 
19531 caaaagagtg gcgcatttga gttgttcgac aaagatatgg acgttaaata cttaacaaaa gatgtaaatg 
19601 acacaatgat tgagaaccat ttagatcgaa tcgaaaagaa tatcatgcgt tttgcaaagt cagtaaactt 
19671 taattctgac gagtttaacg gaaatgtacc tatcattgga atgaaactta aacttatggc tttagagaac 
19741 aagtgtatga cgtttgagcg taagatgaca gctatgttga ggtatcaatt caaagttatt ttatctgcat 
19811 taaagcgtaa agggtacaac ttggatgatg atagttattt aaacctgata tttaagttca ctcgtaacat 
19881 tccagttaat aagttagaag aatcacaagt gctaattaac ctgaagggac aagtttcaga acgaacaagg 
19951 ttaggacaat cacaactagt tgatgatgtt gattacgaat tagacgaaat ggaaaaagaa agtcttgaat 
20021 tcaatgacaa attacctgac atagatgaag gtgacgcaaa tgacaaatcc caaaataacc aatcagaatg 
20091 atattgatga gtatatcgag ggtttaatct ctaaagcaga aaaaccaata gaacaactat ttgctaatcg 
20161 acttaaagag ataaaacaaa tcatcgcaga tatgtttgag aaatatcaaa atgatgatgt gtatgttaca 
20231 tggactgaat tcaataaata caacaggctc aataaggagt taactcgtat aggtacaatg ttgacttatg 
20301 actataggca agtagctaag atgattcaga agtcacaaga agatgcttat atagaaaaat tccttatgag 
20371 cctttattta tatgaaatgg cgagtcaaac atctatgcag tctgatgttc cgagtaaaga ggtaatcaaa 
20441 tcagctactg aacaacctat tgagttcatt cgtttaatgc caacactaca aaaacatcgt gatgaagtat 
20511 tgaaaaagat acgtatgcac attacacaag gtattatgag tggagagggt tactctaaga tagctaaagc 
20581 aatacgtgat gatgtcggca tgtctaaagc tcaatcattg cgtgtggctc gtacagaagc aggcagagca
20651 acgtcacaag ctggacttga tagcgcaatg gttgctaaag ataacggttt gaatatgaag aaacgttggc 
20721 atgctactaa agatacacga acacgtgata ctcatcgtca tttagatggg gaatcagtgg aaatagatca

20791 gaattttaaa tcaagtgggt gtgttgggca ggcgcccaag ctatttattg gtgtaaacag tgcgaaagag 

20861 aatattaatt gtcgttgcaa attactttat tatattgatg aaaatgaatt gccaactgta atgagagcac 

20931 gtaaagacga tggtaaaaat gaagttatcc cattcatgac ttatcgtgag tgggagaaat ataagcgaaa 

21001 aggtggtaat tgatatggat tttaaaataa aagtaaatgt tgatactggc gaagctatag aaaagttaga

21071 acgcattaaa tccttgtacg aagagataat agagttacaa aacgaaaaag ttgttgtaaa cgtaacagtt 

21141 aaaaatgaag ctgatttaga tatggttaaa acatctatta gcgaagaaaa tgctaaaaac aatgatttca 

21211 cactttttta gttgtctctt tgctactcga ccttagcatg tcgttaaact gctttttatt atgcactttt 

21281 cggactgtta gggtacgcga agggcaaaaa ggagttttga tatatgaata tcgaagaagt taagtctttt 

21351 tttgaagaac acaaagacga taaagaagta aaagattatc taaagggact taagacggtg tctgttgatg 

21421 acgttaaagg ctttttagat acagaagaag gtaaacgatt cattcaacct gaattagatc gttatcattc 

21491 gaaaggatta gaatcatgga aagagaaaaa tcttgaggat ctaatcgaac aagaagtacg gaagcgtaat 

21561 cctgagcaat cagaagaaca aaaacgtatt agtgctcttg aacaagagtt agaaaaacgc gacgcagagg 

21631 caaaacgtga gaagttaaga agtaacgcgc taggtaaagc gcaggaacta aatttaccaa catccttagt 

21701 tgatagattt ttaggcgatt ctgatgaaga tactgagcaa aacttaaaag ctttaaaaga aacctttgac 

21771 aagtatgttc aaaaaggcgt tgagtctaaa tttaaatcga gtggaagaga tgttaaagaa tcacgaaatc 

21841 aagatttaga cccttcaaat gtaaagtcca ttgaagaaat ggcgaaagaa atcaatatta gaaaataaag 

21911 tgaggtaata aaatatggca actccaacat acacgccagg caatgttatt ttatcggatt ttaaaaacgg 

21981 cgttattcca gcagaacaag gtactttaat catgaaagac attatggcta attcagcaat tatgaaatta 

22051 gctaaaaatg agccaatgac agcacaaaag aaaaaattta cttacttagc aaaaggtgta ggcgcctact 

22121 gggtatcaga aacggaacgt attcaaactt ctaagcctga atatgcgcaa gcagaaatgg aagctaagaa 

22191 aattggtgta attattccgt tatcaaaaga gtttcttaaa tggactgcaa aagatttctt taatgaggtt 

22261 aaacctctaa ttgcagaggc attttacaaa gcgtttgacc aagctgttat ctttggtact aaatcacctt 

22331 acaacacttc aactagtggt aaaccgcttg ttgaaggcgc agaagagaaa ggtaacgttg ttacagatac 

22401 taataattta tacgtagacc tttcggcact aatggctact attgaagatg aagagttaga tccaaacgga 

22471 gtattaacca cacgttcatt cagaagtaaa atgcgtaatg ctttagatgc caatgacaga ccattatttg 

22541 atgctaacgg gaacgagatt atgggattac cactatctta tactggagcg gatgtatacg acaaaaagaa 

22611 atcgttagca ctaatgggtg attgggatta cgcacgttac ggtatcttac aaggtattga gtatgcaatt 

22681 tctgaagatg ccacgttaac gacgttacaa gcatcagatg cttctggcca accagtatca ttatttgaac 

22751 gtgatatgtt cgctttacgt gcgacgatgc atattgcata catgaacgtt aaaccagaag cgttcgcaac 

22821 gcttaaacca actgaatagg aggagatatg atggctaatc ctgcagaaga gattaaggta aaaaaagaca 

22891 atatgactat tactgttaca aagaaggcat ttgactctta ttacagtctt gtcggttaca aagaggttaa 

22961 atcacgtcgt actacgtctg ataagagcga gtgataaaaa tgactcttta tgaagatgtt aaacttttac 

23031 tcaagaaaaa tggagtggaa gttaaaagtg atgaagaaga aatatttaag atggaagttg acggaatact 

23101 agaagatgtt agggatataa caaacaatga ttttatgaaa gatggtcaag tcatttatcc ttactcaatc 

23171 aaaaagtatg tcgcagatgt cctagagtat tatcaacgac ctgaagttaa aaagaattta aagtcaagaa 

23241 gtatggggac agtgtcgtac acttataacg atggtgtccc tgattacatt agtggagtat taaacaggta 

23311 taaacgagca aagtttcatc cgtttaaacc aataaggtag aggtgttgtt tgtgtttaac ccatacgacg 

23381 aattccctca cactatttct attggaagta tcaaaaaagt aggagagtat ccaattatac aagagcgctt 

23451 tgtaagcgat aaaacaatta aaggatttat ggatacgcct actacatctg aacaactaaa atttcatcaa 

23521 atgtcacaag aatatgacag aaacctatat gtaccttatg acttgccaac atctaaaaac aatttatttg

23591 agtatgaggg tagaatcttt agtattgaag gtgattctgt agatcagggc ggacaacatg aaattaagtt 

23661 actacgactt aagcaggtgc catatggcaa aagttaagta cggtgctgat agcatggttg ttgaattgga 

23731 taagttcgat aagaaaatag aagagtgggt taaaaaaggt attgctaaaa caacgacgaa gatttacaac 

23801 actgctgtag cattagctcc tgttgactta ggttttttag aagaaagtat tgactttaaa tatttcgatg 

23871 gtgggttatc cagtgttata agtgtcggcg cagattatgc aatatacgtt gaatacggta ctggtatata

23941 tgctactggt cctggtggta gtcgtgctac aaagattccg tggagtttta aaggtgatga cggcgaatgg 

24011 tacaccacat atggtcaagc gccacagcca ttttggaacc ctgcaattga cgcaggacgc aagacattcg 

24081 agcagtattt ttcatagagg tggttaaata tgtgggtatc agttgagcct gaacttacaa atcaaatata 

24151 taaaagatta atctcagacc ctaacattaa caaactagtt gatgataggg tttttgacgc tgttcaagat 

24221 gacgctgttt acccatatat tgttgtgggt gaatcaaacg tcactaacaa cgaatctagc gcaacaatga 

24291 gagaaacagt cggtattgtc atacatgtgt attcacagct cgctacacaa tacgaggcta agctcatttt 

24361 aagcgcgata ggttatgtgc ttaacagacc tatagaaata gataattacg agtttcaatc tagccgtatc 

24431 gatagtcaag cagtattccc tgatatagac aggtttacta agcatggcac gatacggctc ttatttaagt 

24501 acagacataa aaagaaaaac gaaggagtgt attaaatggc gcaaaaaaac tatttagcag ttgtacgtcc

24571 agctgaaact gacttagatc cagtagaatc tttattatta gctgacttac aagaaggtgg acatacgatt 

24641 gaaaatgatt tagctgaaat agtacgaggc ggtaaaacgg actattctcc caatgcaatg tcagaatcat 

24711 ttaaattaac aattggtaat gtgcctggag ataaaggaat tgaagcagtg aaacacgctg tacaaacagg

24781 tggacagttg cgtatatggc tttatgagcg taataaacgt gcagacggta aacatcacgg aatgtttggt 

24851 tatgttgttc cagaatcatt tgaaatgtca tttgatgatg aaagtgacaa aatcgaacta tcattaaaag 

24921 ttaaatggaa tacagcagaa ggtgctgaag ataacttgcc gaaagagtgg tttgaagctg caggtgcgcc 

24991 tacagtcgaa tacgaaaaat tcggcgaaaa agtcggaaca ctcgagaatc aaaagaaagc cagtgttgta 

25061 tctgattcac acacggaaga ccattctatg taaactaata gatcaagggg gcgtaagctc cctatctttt 

25131 cataaaaaaa ttgaaaagag gtatatattt tgactgaatt taatccaatt acaacattaa aaattaatga 

25201 cggagaaaaa gattacgaag tagaagcaaa agtaacattt gcatttgacc gaaaagctga aaaattctca 

25271 gaagatagcg aagatgggag aaaaggagca atgccaggat ccaatgttat ctttaacggt ttgctagaat

25341 ctagaaacaa agcgatttta caattttggg aatgtgctac tgcttattta aaaaacccac caactcgaga 

25411 acaattagaa aaagcaattg atgatttcat cactgaaaac gaggatactt tgccgttatt acaaggggct 

25481 tcggacaaac ttaacaatag tggttttttc aagagggaga gtcgctcgta ctggatgaca ttgaacaaag 

25551 caccgaatat ggccaaaagc gaggacaaag aaatgacgaa agcaggcata gaaatgatga aagagaatta

25621 caaggaaatc atgggcgcag aaccttacac gattactcaa aaataaggca actgacagct agatatttag 

25691 gatatatccc tgaacatgaa ttgttagcac taacacctgc tgaatggcgt gattggctta ttggtggtca 

25761 ggataggtac ctagatcaaa gacaattatt aattgaacaa gcgcaagcta acggcttagt acaagcttct 

25831 aagaggctaa ctagtatgat tcgtgacatt gagaaacaac gttacgaaat aagagaacct ggtagctatg 

25901 ctcgtgtaca aaaagctaga ttagaagaag aaaaaagaag acgtgaactc ttcaaagaag gtacaagaaa 

25971 attccttgaa tcgaaaggag gttagccttt ggatactcat ttcatggcaa agattatggc caatattaga 
26041 gatttccaaa gcaacgtaag gaaagctcaa cgattagcaa agacgtctgt accaaacgaa attgaaacag

26111 atgtaaaagc agatatttca agattccaaa gagctttaca acgcgctaaa tcaatggctc aacgatggcg 

26181 agagcattct gttaaattat tcatgaaaac agatgagtat aaagcgaatt tagaacgcgc taaagctcaa 

26251 gtagagcgat ttaaacaaca taaagtagat ttgaaactaa gtaacactga attaatggcc aaatataatg 

26321 caactaaagc tactgtcgaa gcttggagaa aacatgttgt taagttggat ttagatgcaa accccgctaa 

26391 aatggcggtt aaagggttta aagaagattt aatagatctt agcaggcata gttttgatat tgattccagc 

26461 agatggaaat taggaaataa attcacaaaa gaattcaatg aagtcgaagg agcagttaaa cgttctttcg 

26531 gaagaattgg tcagattatg agaaaagaag taaatggaac aagtgatatt tggggtaaac ttaacaactc

26601 attgaaagat tacggcgaga aaatggacgc cttagctact aaaatccgaa ctttcggtac tatcttcgcg

26671 caacaggtca aaggcttaat gattgctagt atacaagcat tgataccagt gattgccgga ttagtacctg 

26741 caataatggc agtacttaat gcggttggtg tattaggtgg tggcgtttta ggtttagttg gcgcattctc 

26811 tgtcgcaggt cttggagttg ttggctttgg tgcaatggct attagcgctc ttaaaatggt tgaagatgga 

26881 acattggcag taacaaaaga agttcaaaac tttagagatg cgagcgatca gttaaaaact acatggcgtg 

26951 atattgttaa agagaatcaa gcaagtatct ttaatgcgat gtcagcaggt atcagaggcg ttacaagtgc 

27021 gatgtctcaa ttaaaaccat tcttatccga agtatctatg ctagttgaag caaacgcacg cgagtttgag 

27091 aattgggtta aacattccga aacagctaag aaagcgtttg aagcattgaa tagcataggt ggcgcaatct 

27161 tcggagattt attgaacgct gcaggacgat ttggcgacgg attagttaac attttcactc aattaatgcc 

27231 gttgttcaaa tttgtgtctc aaggactaca gaacatgtct atagctttcc aaaattgggc taatagtgta 

27301 gctggtcaga atgctattaa agcgtttatt gactacacta ccactaactt acctaagatt ggtcagatat 

27371 ttggtaatgt gttcgctggt attggtaatt taatgattgc ttttgcacaa aacagttcca acatttttga 

27441 ttggttggtt aaattaactt ctcaatttag agcatggtca gaacaagtag gacaatcaca agggtttaaa 

27511 gactttatca gttatgttca agagaatggt cctactatta tgcagttaat cggtaatatc gtaaaagcat 

27581 tagttgcttt tggtactgca atggctccta tagctagtaa attgttagac tttatcacta atctagctgg 

27651 atttatcgct aaactattcg aaacacaccc agctatagca caagttgctg gcgttatggg tattttaggc 

27721 ggtgtatttt gggctttaat ggctccgatt gttgctataa gtagtgtact tacaaatgtg tttggtttga 

27791 gcttattcag cgtcactgaa aagattttag acttcgttag aacatcaagt ttagttactg gagctacgga

27861 agcattaata ggtgcattcg gttcgatttc agcacctatt ttagcagttg ttgcagtaat tggtgcattc 

27931 attggtgtcc tcgtttattt atggaaaaca aacgagaact ttagaaatac tattactgaa gcgtggaacg 

28001 gtgttaaaac ggcagtttct ggtgcgattc aaggtgtagt cggctggtta actgaattgt ggggcaaaat 

28071 ccaatctacc ttacaaccga taatgcctat attgcaagta ttaggacaaa tattcatgca agttttaggt 

28141 gttttggtaa taggcatcat tacaaacgtt atgaatatca tacaaggttt gtggacttta attacaattg 

28211 cgttccaagc cataggaaca gtgatatccg tagcagtcca aatcatagta ggtttgttca ctgctttaat 

28281 tcagttgctt actggcgact tctcaggtgc ttgggagact attaaaacta cggttaccaa tgtgcttgat 

28351 acgatttggc aatacatgca atcagtttgg gagtcaatta tcggcttttt aactggcgta atgaatcgaa 

28421 cactttctat gtttggtaca agttggtcac agatatggag tacaatcact aattttgtta gcagtatttg 

28491 gaacactgtt acaagttggt tcagtcgagt ggcttcgagt gtagctgaaa aaatggggca agcactaaac 

28561 tttattatca caaaaggttc tgaatgggtt tctaacattt ggaatacagt tacaagtttc gcgagtaaag 

28631 tagctgatgg gtttaaaaga gttgtctcaa atgtaggtga cggtatgagt gatgcacttg gtaagattaa 

28701 aagtttcttc agtgatttct taaatgccgg agcggaatta atcggcaaag tagctgaggg tgtagccaaa, 

28771 tctgcgcaca aagtagtcag cgcggtaggc gatgcgattt catcagcttg ggactctgta acttcattcg 

28841 taagtggaca cggtggaggt agtagcttag gtaaaggttt agcggtatca caagcaaaag taattgctac

28911 agactttggc agtgccttta ataaagagct atcctctact ttgacagata gtatagtaaa tcctgtaagt 

28981 acttctatag acagacacat gactagcgat gttcaacata gcttaaaaga aaataataga cctattgtga 

29051 atgtaacgat tagaaatgag ggcgaccttg atttaattaa atcacgcatt gatgacatga acgctataga 

29121 cggaagtttc aacttattat aagggaggtt tgttagttga tagcgcacga tatagaagta ataaggaatg 

29191 gttcacagta tcgcgtcagt gacaatcctt tcacttataa tcacttggaa gtagttgaat ataacgttac 

29261 aggcgcagga tatcatcgta actattctga tatagagggt attgatggta gatttcataa ttacgctaaa 

29331 gaagaactta aaaaagtaga gcttaagata aggtataaag tacctaaaat tgcttatgct tcacatttaa 

29401 agtcagacgt ccaagcacta tttgctggac gtttttattt aagggaatta gctacaccag acaattcaat 

29471 taagtatgag catatattag atataccaaa agacaaacaa gcatttgagc ttgattatgt tgatggacga 

29541 caactctttg taggactagt aagtgaagtt tcttttgaca caacacaaac atcaggggaa ttttctttgt 

29611 cgtttgaaac aaccgaacta ccatactttg aaagtgtcgg ttatagtact gatcttgaaa gtaataacga

29681 ccctgaaaaa tggtcggtac ctgatagatt gcctacaaac gaaggtgata agaggcgtca aatgacattt 

29751 tacaacacta actcaggaga agtttattat aacggtgatg ttcctttaac acagtttaat cagtttaatg 

29821 ttgttgaaat agagttagct gaagatgtta aagctaatga taaggatgga ttcactttct atacagataa 

29891 aggaaatatc tcagttatta aggaagttga tttaaaagcc ggagataaaa taatcttcga cggtaaacat 

29961 acctatagag gttatttaaa tatagattct tttaataaaa ctttagaaca accggtttta tatccaggct 

30031 ggaatcgatt caagtctaat aaagtaatga aacaaattac atttagacac aaattatatt ttagataagg 

30101 agtagcctat gccaatttta ttaaaaagtc tacagggtgt agggcacgct attaatgtta gtacaaaggt 

30171 aagtaaaaag ctaaatgaag atagttcttt ggatctaact attatcgaga acgcgagtac gtttgacgca 

30241 ataggtgcta taactaaaat gtggacgatc actcatgttg aaggtgaaga tgatttcaac gaatatgtaa 

30311 ttgtcatact tgataagtct actattggcg aaaaaataag gcttgatatc aaagctaggc aaaaagaact 

30381 tgatgacctt aacaattcta ggatttacca agagtataac gaaagtttta caggcgttga gttcttcaat 

30451 actgtcttta aaggaacggg ttataagtat gtattacatc caaaagtaga tgcatctaaa ctcgagggat

30521 taggcaaagg agatacacga ttagaaatct ttaaaaaagg acttgagcgt tatcatctcg aatatgaata 

30591 cgatgcaaag actaaaacgt ttcatttgta tgatgaatta tcfcaagtttg ccaattatta cattaaagct 

30661 ggtgtgaatg ctgataacgt caaaatacaa gaagatgcat ctaaatgtta tacctttatt aaaggttatg 

30731 gtgattttga tggacaacag acttttgcag aagcgggact acaaattgaa ttcactcatc cattagcaca 

30801 attgataggt aaaagagaag cgccaccgct tgttgatgga cgtattaaaa aagaagatag tttaaaaaaa 

30871 gcaatggagt tattgataaa gaaaagtgtc actgcttcta tttccttaga ctttgtagcg ttacgtgaac 

30941 atttcccaga agctaaccct aaaataggtg atgttgttag agtggtggac tctgccatag gatataacga 

31011 cttagtgaga atagtcgaaa tcactacaca tagagatgcg tacaataata tcactaagca agatgtagta 

31081 ttaggagact ttacaaggcg taatcgttat aacaaagcag ttcatgatgc tgcaaattat gttaaaagcg 

31151 taaaatctac aaaacccgac ccatctaaag aactaaaagc actaaacgca aaagttaacg caagtttatc 

31221 tataaataat gaattggtta agcagaatga aaaaataaac gctaaagtcg ataagatgaa cactaaaaca

31291 gttacaactg ctaatggtac gatcatgtac gactttacta gtcaatcaag tataagaaac atcaaatcaa 
31361 ttggaacgat tggcgactct gtagctagag ggtcgcacgc aaaaactaat ttcacagaaa tgttaggcaa
31431 gaaattgaaa gctaaaacga ctaatcttgc aagaggtggc gcaacaatgg caacagttcc aataggtaaa
31501 gaagcggtag aaaacagcat ttatagacaa gcagagcaaa taagaggaga cctaatcata ttacaaggca
31571 ctgatgatga ctggttacac ggttattggg caggcgtacc gataggcact gataaaacgg atacaaaaac
31641 gttttacggt gccttttgtt ctgcaattga agttattaga aagaataatc cagattcaaa aatactagtg
31711 atgacagcta caagacaatg ccctatgagt ggtacaacaa tacgccgtaa agacacggac aaaaacaaac
31781 tagggttaac acttgaggac tatgtaaacg ctcaaatatt agcttgtagt gagttagatg taccagtgtt
31851 tgacgcatat cacacagatt actttaagcc atacaatcca gcttttagga aagcgagcat ggaggacggc
31921 ttacacccta acgaaaaagg tcacgaggtt attatgtacg agttaatcaa ggattattac agtttttacg
31991 actaaaggag gcaaccaatg gcttacggat taattacaag tttacattca atgacaggtc ggaaaatagt
32061 tgctcaacat gagtataact atcgcttgtt agatgaaggt atgagcaaac ttgagaaaat gtttatatac
32131 catcaaaaag aagaaatata cgcacactca gcgaaacaaa ttaaatactt gaatgacagt gttgaagatt
32201 atttaacgta tttaaatagc cgttttagca atatgattct aggccataac ggcgacggta tcaatgaagt
32271 aaaagacgcg cgtattgata atacaggtta tggtcataag acattgcaag atcgtttgta tcatgattat
32341 tcaacactag atgctttcac taaaaaggtt gagaaagctg tagatgaaca ctataaagaa tatcgagcga
32411 cagaataccg attcgaacca aaagagcaag aaccggaatt tatcactgat ttatcgccat atacaaatgc
32481 agtaatgcaa tcattttggg tagaccctag aacgaaaatt atttatatga cgcaagctcg tccaggtaat
32551 cattacatgt tatctagatt gaagcccaac ggacaattta ttgatagatt gcttgttaaa aacggcggtc
32621 acggtacaca caatgcgtat agatacattg atggagaatt atggatttat tcagctgtat tggacagtaa
32691 caaaaacaac aagtttgtac gtttccaata tagaactgga gaaataactt atggtaatga aatgcaagat
32761 gtcatgccga atatatttaa cgacagatat acgtcagcga tttataatcc tatagaaaat ttaatgattt
32831 tcagacgtga atataaagct tctgaaagac aagctaagaa ttcattgaat ttcattgaag taagaagtgc
32901 tgacgatatt gataaaggta tagacaaagt attgtatcaa atggatatac ctatggaata cacttcagat
32971 acacaaccta tgcaaggtat cacttatgat gcaggtatct tatattggta tacaggtgat tcgaatacag
33041 ccaaccctaa ctacttacaa ggtttcgata taaaaacaaa agaattgtta tttaaacgac gtatcgatat
33111 tggcggtgtg aataataact ttaaaggaga cttccaagaa gctgagggtc tagacatgta ttacgatcta
33181 gaaacaggac gtaaagcact tttaataggg gtaactattg gacctggtaa taacagacat cactcaattt
33251 attctatcgg ccaaagaggt gttaaccaat tcttaaaaaa cattgcacct caagtatcga tgactgattc
33321 aggtggacgt gttaaaccgt taccaataca gaacccagca tatctaagtg atattacgga agttggtcat
33391 tactatatct atacgcaaga cacacaaaat gcattagatt tcccgttacc gaaagcgttt agagatgcag
33461 ggtggttctt ggatgtactg cctggacact ataatggtgc tctaagacaa gtacttacca gaaacagcac
33531 aggtagaaat atgcttaaat tcgaacgtgt cattgacact ttcaataaga aaaacaacgg agcatggaat
33601 ttctgtccgc aaaacgccgg ttattgggaa catatcccta agagtattac aaaattatca gatttaaaaa
33671 tcgttggttt agatttctat atcactactg aagaatcaaa acgatttact gattttccta aagactttaa
33741 aggtattgca ggttggatat tagaagtaaa atcgaataca ccaggtaaca caacacaagt attaagacgt
33811 aataacttcc cgtctgcaca tcaattttta gttagaaact ttggtactgg tggcgttggt aaatggagtt
33881 tattcgaagg aaaggtggtt gaataatgat agtagataat ttttcgaaag acgataactt aatcgagtta
33951 caaacaacat cacaatataa tccaattatt gacacaaaca tcagcttcta tgaatcagat agaggaactg
34021 gtgttttaaa ttttgcagta actaagaata acagaccgtt atctataagt tctgaacatg ttaaaacatc
34091 tatcgtgtta aaaaccgatg attataacgt agatagaggc gcttatattt cagacgaatt aacgatagta
34161 gacgcaatta atgggcgttt gcagtatgtg ataccgaatg aatttttaaa acattcaggc aaggtgcatg
34231 ctcaggcatt ctttacacaa aacgggagta ataatgttgt tgttgaacgt caatttagct tcaatattga
34301 aaatgattta gttagtgggt ttgatggtat aacaaagctt gtttatatca aatctattca agatactatc
34371 gaagcagtcg gtaaagactt taaccaatta aagcaagata tggatgatac acaaacgtta atagcaaaag
34441 tgaatgatag tgcgacaaaa ggcattcaac aaatcgaaat caagcaaaac gaagctatac aagctattac
34511 tgcgacgcaa actagtgcaa cacaagctgt tacagctgaa gtcgataaaa tagttgaaaa agagcaagcg
34581 atttttgaac gtgttaacga agttgaacaa caaatcaatg gcgctgacct tgttaaaggt aattcaacaa
34651 caaattggca aaagtctaaa cttacagatg attacggtaa agcaattgaa tcgtatgagc agtccataga
34721 tagcgtttta agcgcagtta acacatctag gattattcat attactaatg caacagatgc gccagaaaag
34791 acggatatag gcacgttaga gaagcctgga caagatggtg ttgatgacgg ttcttcgttc gatgaatcaa
34861 cttatacatc aagcaaatct ggtgtgttag ttgcctatgt tgttgataat aatactgctc gtgcaacatg
34931 gtacccagac gattcaaacg atgagtacac aaaatacaaa atctacggca catggtaccc gttttataaa
35001 aagaatgatg gaaacttaac taagcaattt gttgaagaaa cgtctaacaa cgctttaaat caagctaagc
35071 agtatgtaga tgataaattc ggaacaacga gctggcaaca acataagatg acagaggcga atggtcaatc
35141 aattcaagtt aacttaaata atgcgcaagg cgatttggga tacttaactg ctggtaatta ctatgcaaca
35211 agagtgccgg atttaccagg tagtgttgaa agttatgagg gttatttatc ggtattcgtt aaagacgata
35281 caaacaagct atttaacttc acgccttata actctaaaaa gatttacaca cgatcaatca caaacggcag
35351 acttgagcaa cagtggacag ttcctaatga acataagtca acggtattgt tcgacggtgg agcaaatggt
35421 gtaggtacaa caatcaatct aaccgaacca tacacaaact actctatttt attagtaagt ggaacttatc
35491 caggtggcgt tattgaggga ttcggactaa ccacattacc taatgcaatt caattaagta aagcgaatgt
35561 agttgactca gacggtaacg gtggcggtat ttatgagtgt ttactatcca aaacaagtag cactacttta
35631 agaatcgata acgatgtgta ctttgattta ggtaaaacat caggttctgg agcgaatgcc aacaaagtta
35701 ctataactaa aattatgggg tggaaataat gaaaatcaca gtaaatgata aaaatgaagt tatcggatac
35771 gttaatactg gcggtttacg caatagttta gatgtagacg ataacaatgt gtctatcaaa ctcaaagaag
35841 agttcgaacc tagaaagttc gttttcacta acggcgaaac taaatacaat agcaatttcg aaaaagaaga
35911 cgtaccgaat gcatcaaacc aacaaagtgc gtcagattta agtgatgagg aacttcgcgg aatggttgca
35981 agtatgcaaa tgcagatgac gcaagtgaac atgttgacaa tgcaattgac gcaacaaaac gctatgttaa
36051 cacaacagtt gaccgaactg aaaactaaca aaacaaatac tgagggggac gtttaaatga tgaagatgat
36121 ttatccaact tttaaagaca ttaaaacttt ttatgtgtgg ggttgctata aaaatgagca aattaagtgg
36191 tacgtagaca tgggtgtaat cgacaaagaa gaatatgcac tgatcactgg tgaaaaatat ccagaggcaa
36261 aagatgaaaa gtcacaggtg taatgcttga ggcttcttaa tttaacacaa agtaggtggc gtaatgtttg
36331 gattcaccaa acggcacgaa catgaatggc gaattagaag attagaagag aatgataaaa caatgcttag
36401 cactctcaat gagattaaat taggtcaaaa aactcaagag caagttaaca ttaaattaga taaaacttta
36471 gatgctatcc agagggaaag acagatagac gaaaaaaata agaaagaaaa cgacaaaaat atacgcgata
36541 tgaaaatgcg gattctcggt ttgataggga ctatcttcag tacgattgtc atagctttac taagaactat
36611 ttttggtatt taaaggaggt gattaccatg cttaaaggga ttttaggata tagcttctgg gcgtgcttct
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36681 ggtttggtaa atgtaaataa cagttaagag tcagtgcttc ggcactggct ttttattttg attgaaatga 
36751 ggtgcataca tgggattacc taacccaaag actagaaagc ctacagctag tgaagtggtg gagtgggcaa 
36821 agtcgaatat tggtaagagg attaatatag ataattatcg gggcagtcaa tgttgggata cacctaactt
36891 tatttttaaa agatattggg gttttgtaac atggggcaat gctaaggata tggctaatta cagatatcct 
36961 aagggtttcc gattctatcg ttattcatct ggatttgtac cggaacctgg agacatcgca gtttggcacc 
37031 ctggcaacgg aataggttcg gacggacaca ccgcaatagt agtaggacca tctaataaaa gttattttta 
37101 tagcgttgac caaaactggg ttaattctaa tagttggaca ggttctccag gaagattagt aagacaccct 
37171 tatgtaagtg ttacaggctt tgttaggcct ccatactcaa aagatactag caaacctagt agtactgata 
37241 caagttcagc atcaaaagcc aatgactcaa caattactgg cgaagcgaag aaaccgcaat ttaaagaagt 
37311 caaaacagta aaatacactg cttacagcaa tgttttagat aaagaagagc acttcattga tcatatagtt 

37381 gtaatgggtg atgaacgctc agatattcaa ggattatata taaaagaatc aatgcatatg cgttctgtag
37451 acgaactgta tacgcaaaga aataagttta taagcgatta tgaaataccg catttatatg tcgatagaga
37521 ggctacatgg cttgctagac caaccaattt tgatgacccg cgtcacccta attggctagt tattgaagta
37591 tgtggtggtc aaacagatag caaacgacaa ttcttattga atcaaataca agcgttaata cgtggtgttt 
37661 ggttattgtc agggattgat aaaaacttat ctgaaacgac gttaaaggta gaccctaata tttggcgtag 
37731 tatgaaagat ttaattaatt acgacttgat taagcaaggt ataccggata acgcaaagta tgagcaagtt 
37801 aaaaagaaaa tgcttgagac atacatcaaa cgagatatat tgacacgaga aaatataaaa gaagtaacga
37871 caaaaacaac aataagaatt agtgataaaa catcagttga cagtgcgtcc acacgaggcc ctactccatc 

37941 agacgaaaaa ccaagcatcg ttactgaaac aagtccattc acattccagc aagcactgga tagacaaatg
38011 tctaggggta acccgaaaaa atctcataca tggggctggg ctaatgcaac acgagcacaa acgagctcgg
38081 caatgaatgt taagcgaata tgggaaagta acacgcaatg ctatcaaatg cttaatttag gcaagtatca
38151 aggcatttca gttagtgcgc ttaacaaaat acttaaagga aaaggaacgc tcgacggaca aggcaaagca
38221 ttcgcggaag cttgtaagaa aaacaacatt aacgaaattt atttgatcgc gcacgctttc ttagaaagtg 
38291 gatacggaac aagtaacttc gctagtggta gatacggtgc atataattac ttcggtattg gtgcattcga 
38361 caacgaccct gattatgcaa tgacgtttgc taaaaataaa ggttggacat ctccagcaaa agcaatcatg 
38431 ggcggtgcta gcttcgtaag aaaggattac atcaataaag gtcaaaacac attgtaccga attagatgga
38501 atcctaagaa tccagctacc caccaatacg ctactgctat agagtggtgc caacatcaag caagtacaat 
38571 cgctaagtta tataaacaaa tcggcttaaa aggtatctac ttcacaaggg ataaatataa ataaagaggt 
38641 gtgtaaatgt acaaaataaa agatgttgaa acgagaataa aaaatgatgg tgttgactta ggtgacattg 
38711 gctgtcgatt ttacactgaa gatgaaaata cagcatctat aagaataggt atcaatgaca aacaaggtcg 
38781 tatcgatcta aaagcacatg gcttaacacc tagattacat ttgtttatgg aagatggctc tatattcaaa 
38851 aatgagcccc ttattatcga cgatgttgta aaagggttcc ttacctacaa aatacctaaa aaggttatca 
38921 aacacgctgg ttatgttcgc tgtaagctgt ttttagagaa agaagaagaa aaaatacatg tcgcaaactt
38991 ttctttcaat atcgttgata gtggtattga atctgctgta gcaaaagaaa tcgatgttaa attggtagat 
39061 gacgctatta cgagaatttt aaaagataac gcgacagatt tattgagcaa agactttaaa gagaaaatag 
39131 ataaagatgt catttcttac atcgaaaaga atgaaagtag atttaaaggt gcgaaaggtg ataaaggcga 
39201 accgggacaa cctggtgcga aaggtgatac aggtaaaaaa ggagaacaag gcgcacccgg taaaaacggt 
39271 actgtagtat caatcaatcc tgacactaaa atgtggcaaa ttgatggtaa agatacagat atcaaagcag 
39341 aacctgagtt attggacaaa atcaatatcg caaatgttga agggttagaa gataaattgc aagaagttaa
39411 aaaaatcaaa gatacaactc tcaacgactc caaaacgtat acggattcaa aaattgctga actagttgat
39481 agcgcgcctg aatctatgaa tacattaaga gaattagcag aagcaataca aaacaactct atttcagaaa 
39551 gtgtattgca acagattggc tcaaaagtta gtacagaaga ttttgaggaa ttcaaacaaa cactaaacga 
39621 tttatatgct ccaaaaaatc ataatcatga tgagcggtat gttttgtcat ctcaagcttt tactaaacaa 
39691 caagcggata atttatatca actaaaaagc gcatctcaac cgacggttaa aatttggaca ggaacagaaa
39761 atgaatataa ctatatatat caaaaagacc ctaatacact ttacttaatt aaggggtgat ttttatggaa
39831 ggtaatttta aaaatgtaaa gaagtttatt tacgaaggtg aagaatatac aaaagtatat gctggaaata
39901 tccaagtatg gaaaaagcct tcatcttttg taataaaacc cttacctaaa aataaatatc cggatagcat
39971 agaagaatca acagcaaaat ggacaataaa tggagttgaa cctaataaaa gttatcaggt gacaatagaa 
40041 aatgtacgta gcggtataat gagggtttcg caaactaatt taggttcaag tgatttagga atatcaggag 
40111 tcaatagcgg agttgcaagt aaaaatatca actttagtaa tccttcaggg atgttgtatg tcactataag 
40181 tgatgtttat tcaggatctc caacattgac cattgaataa ttttaaacga ctaatttttt agtcgttttt 

40251 tattttggat aaaaggagca aacaaatgga tgcaaaagta ataacaagat acatcgtatt gatcttagca
40321 ttagtaaatc aattcttagc gaacaaaggt attagcccga ttccagtaga cgatgagact atatcatcaa 
40391 taatacttac tgttgttgct ttatatacta cgtataaaga caatccaaca tctcaagaag gtaaatgggc 
40461 aaatcaaaag ctaaagaaat ataaagctga aaacaagtat agaaaagcaa cagggcaagc gccaattaaa 
40531 gaagtaatga cacctacgaa tatgaacgac acaaatgatt tagggtaggt gttgaccaat gttgataaca 
40601 aaaaaccaag cagaaaaatg gtttgataat tcattaggga agcagttcaa tcctgatttg ttttatggat 
40671 ttcagtgtta cgattacgca aatatgtttt ttatgatagc aacaggcgaa aggttacaag gtttatacgc 
40741 ttataatatt ccatttgata ataaagcaag gattgaaaaa tacgggcaaa taattaaaaa ctatgatagc 
40811 cttttaccgc aaaagttgga tattgtcgtt ttcccgtcaa agtatggtgg cggagctgga catgttgaaa 
40881 ttgttgagag cgcaaattta aacactttca catcatatgg gcaaaattgg aatggtaaag gttggacaaa 
40951 tggcgttgcg caacctggtt ggggtcctga aactgttaca agacatgttc attattacga tgacccaatg 
41021 tattttatta gattaaattt cccagataaa gtaagtgttg gagataaagc taaaagcgtt attaagcaag 
41091 caactgccaa aaagcaagca gtaattaaac ctaaaaaaat tatgcttgta gccggtcatg gttataacga 
41161 tcctggagca gtaggaaacg gaacaaacga acgcgatttt atccgtaaat atataacgcc aaatatcgct 
41231 aagtatttaa gacatgcagg tcatgaagtt gcattatatg gtggctcaag tcaatcacaa gacatgtatc 
41301 aagatactgc atacggtgtt aatgtaggaa ataataaaga ttatggatta tattgggtta aatcacaggg 
41371 gtatgacatt gttctagaga ttcatttaga cgcagcagga gaaaatgcaa gtggtgggca tgttattatc
41441 tcaagtcaat tcaatgcgga tactattgat aaaagtatac aagatgttat taaaaataac ttaggacaaa 
41511 taagaggtgt aacaccccgt aatgatttac tgaacgttaa tgtatcagca gaaataaata tcaattatcg 
41581 tttatctgaa ttaggtttta ttactaataa aaaagatatg gattggatta agaagaatta tgacttgtat 
41651 tctaaattaa tagctggtgc gattcatggt aagcctatag gtggtttggt agctggtaat gttaaaacat 
41721 cagctaaaaa ccaaaaaaat ccaccagtgc cagcaggtta tacacttgat aagaataatg tgccttataa 
41791 aaaagagact ggtaattaca cagttgccaa tgttaaaggt aataacgtaa gggacggcta ttcaactaat 
41861 tcaagaatta caggtgtatt acctaataac gcaacaatca aatatgacgg cgcatattgc atcaatgggt
41931 atagatggac tacttatatt gctaatagtg gacaacgtcg ctatattgcg acaggagagg tagataaagc
42001 aggtaatagg ataagtagtt ttggtaagtt tagcacgatt tagtatttac ttagaataaa aattttgcta
42071 cattaattat agggaatctt acagttatta aataactatt tggatggatg ttaatattcc tatacacttt
42141 ttaacattac tctcaagatt taaatgtaga taacaggcag gtactacggt acttgcctat ttttttgtta
42211 taatgtaatt acattaccag taaccaatct ggcttaaaac cacatttccg gtagccaatc cggctatgca
42281 gaggacttac ttgcgtaaag tagtaagaag ctgactgcat atttaaacca cccatactag ttgctgggtg
42351 gttgtttttt atgttatatt ataaatgatc aaaccacacc acctattaat ttaggagtgt ggttattttt
42421 tatgcaaaaa aaacgaaaaa aagttcataa aaagtattgc atatcacgtt taaccgtgtt ataataaggt
42491 ataccagttg agaggaggat aaaaagtgtt agaaaatttt aaaactatag cagaaatcgc cttttataca
42561 atgtcagcaa ttgccatagc gaaaacattg aaaaaagacg ataagtaagt agacaagccc gaaagggctg
42631 tctatatata aattctaaca ctaaaatact atgaaaacaa ttcacattat tttaatcatt cttatttgga
42701 taaacgtgtt tttaggcaac gatataagta aaagtgttgt tgcactgctt actactttac tgcttatcaa
42771 tttatggaag agggataaaa atgacagcaa taaaagaaat aattgaatca atagaaaagt tattcgaaaa
42841 agaaacggga tataaaattg ctaaaaattc cggattacca tatcaaactg tgcaagattt aagaaatgga
42911 aaaacatctt tatcagatgc cagatttaga acgataataa agttatacga gtatcaaaga tcgcttgaaa
42981 acgaagaaga taaataaaag gagccaaaaa tatgtttgtt acaaaagaag aatttaaaac tttgaatgta
43051 aaagaagtat ttgaatcagg taaaaacttt ataaaaatta cagatggaag acatgcaata tattgggtaa
43121 atgatagata cgtagtactt gaccataaaa aaggcgattt gtacccgcaa aaagcatacc caaaatatat
43191 caaaagaaaa ttagtaagtt aaataattag aaaaccacgt cttaattgac gtggttattt tttaggtttg
43261 cgcgtgtcaa atacgtgtca atttagttct atttctttag ttttctttct aaacttaatt gcttgtaaac
43331 cgcatagtta taggcttttc agctatatac caagataaga tttatcccgc cgtctccata aaaatatgct
43401 tggaaacctt gatttaatgg ggttttaatc tagcaagtgt caaatatgtg tcaagaaaat aattttctga
43471 cacgttgacc ttgctctttt ttatgttcat caagtaagtg agagtaggtg tctaaagtta tagatatatt
43541 ataatggcct aatcttttgc taatatattc aatagg
Table 10

Bacteriophage 96 ORFs list

SID     LAN       FRA  POS           a.a.  RBS sequence               STA  STO
100733  96ORF001  1    25999..29142  1047  ccttgaatcgaaaggaggttagcct  ttg  taa
100734  96ORF002  1    32008..33906  632   tttttacgactaaaggaggcaacca  atg  taa
100735  96ORF003  1    30109..31995  628   ttatattttagataaggagtagcct  atg  taa
100736  96ORF004  1    36760..38634  624   attttgattgaaatgaggtgcatac  atg  taa
100737  96ORF005  3    33903..35729  608   gtttattcgaaggaaaggtggttga  ata  taa
100738  96ORF006  2    40589..42043  484   aatgatttagggtaggtgttgacca  atg  tag
100739  96ORF007  1    18652..20091  479   tatacacacatactaaacctgaacg  att  tga
100740  96ORF008  2    8960..10201   413   tggcagaatttgggggcgataacga  atg  tga
100741  96ORF009  2    17447..18670  407   gacgcaataacggaagtgatcgtca  atg  tga
100742  96ORF010  1    38647..39819  390   taaatataaataaagaggtgtgtaa  atg  tga
100743  96ORF011  -1   119..1195     358   gtagctcgcctacccttattatttt  ttg  tga
100744  96ORF012  2    20045..21013  322   tttaatgacaaattacctgacatag  atg  tga
100745  96ORF013  3    29157..30098  313   acttattataagggaggtttgttag  ttg  taa
100746  96ORF014  1    21925..22839  304   agaaaataaagtgaggtaataaaat  atg  tag
100747  96ORF015  1    5812..6591    259   atacacggtaaaggtgggagaatag  atg  taa
100748  96ORF016  1    7852..8607    251   aataaaatgttgaaaggagagaaaa  atg  taa
100749  96ORF017  3    3444..4190    248   aaatttaacattaatatcactttaa  gtg  taa
100750  96ORF018  -3   28281..29000  239   taagctatgttgaacatcgctagtc  atg  tga
100751  96ORF019  3    7188..7859    223   tttaccgttctaggacgtggtttaa  atg  taa
100752  96ORF020  3    21324..21908  194   gaagggcaaaaaggagttttgatat  atg  taa
100753  96ORF021  3    6612..7175    187   attaaaaattaattaaaaggacggt  ata  tag
100754  96ORF022  2    24536..25093  185   aaagaaaaacgaaggagtgtattaa  atg  taa
100755  96ORF023  1    5275..5811    178   catgaaatggtaggaggtatgaaaa  gtg  tag
100756  96ORF024  3    14481..15014  177   taaaacgataggagataacgaataa  atg  taa
100757  96ORF025  2    25157..25666  169   ataaaaaaattgaaaagaggtatat  att  taa
100758  96ORF026  -3   15084..15590  168   tcattcttaacatagcccttaattc  atg  tga
100759  96ORF027  -1   1229..1732    167   aatagcaaataaaggagtgtaaaac  atg  taa
100760  96ORF028  1    16960..17454  164   aaggcgtgtgatacagtgaaaacaa  ttg  taa
100761  96ORF029  -1   1736..2227    163   tatgagaaaaggagtcatataaaag  atg  taa
100762  96ORF030  1    25531..25995  154   1111caagagggagagtcgctcgta  ctg  tag
100763  96ORF031  2    23633..24097  154   tttagtattgaaggtgattctgtag  atc  tag
100764  96ORF032  -2   2248..2706    152   ataagacaccaaaggggtttggcgc  atg  tga
100765  96ORF033  -3   39147..39605  152   agcatataaatcgtttagtgtttgt  ttg  taa
100766  96ORF034  2    13181..13615  144   tagaagtcgaaaaagtggaggcaat  ata  taa
100767  96ORF035  2    10628..11053  141   gagctaggattgcaagcaacgatat  ttg  tga
100768  96ORF036  2    24110..24535  141   gtatttttcatagaggtggttaaat  atg  taa


100769


96ORF037


l 


12 b a J . . 12 77o 


137 


atgaggaacagaagcaaccaacttt 


att 


tga 


100770


96ORF038


l 


lob 2a . . loQ J 2 


134 


atgttaagaatgatgcctagtttaa 


ttg 


taa 


100771  96ORF039  3    39816..40220  134   ctaatacactttacttaattaaggg  gtg  taa


100772 


96ORF040


J 


2 1 J 4 0 . . i '7Ji 


1 T A \ 

134 


tttccataaataaacgaggacacca 


atg 


tga 


100773 


96ORF041


J 


i CTnc i c c m 


1 JJ 


gatgagggcggaggtgtcagagtag 


atg 


tga 


100774  96ORF042  2    35720..36106  128   aagttactataactaaaattatggg  gtg  taa
100775  96ORF043  -2   35713..36081  122   ttaaacgtccccctcagtatttgtt  ttg  taa


100776 


96ORF044




JlOU i . JOiO 


177 
lit 


agtatccatcagt tgaagataatct 


ata 


taa 


100777 






3XJ7 • . JJVi 


1 71 


ccccccccgcac cccgcaacac cca 


a*- *■ 

act 


tga 


100778 


96ORF046


2 


11511 1 1 B77 


110 


aagcaaacgcacagaggcggaacaa 


atg 


*- a a 

caa 


100779  96ORF047  2    22991..23350  119   gtcgtactacgtctgataagagcga  ata  taa
100780  96ORF048  3    8607..8963    118   tggaaaaagaattgagtgatgacta  atg  tga
100781  96ORF049  1    23353..23697  114   atccgtttaaaccaataaggtagag  gtg  taa
100782  96ORF050  -2   2728..3072    114   tggtaaattagtattacattaagta  ata  taa
100783  96ORF051  3    4692..5021    109   tcaaaatatacggaggtagtcaact  atg  tga
100784  96ORF052  -1   20882..21211  109   gtagcaaagagacaactaaaaaagt  gtg  taa
100785  96ORF053  1    40252..40578  108   acgactaattttttagtcgtttttt  att  tag
100786  96ORF054  1    4942..5262    106   aatataaaactaaaaaacaaaattt  atg  tag
100787  96ORF055  -2   4840..5151    103   ccgtcgcaatatatagttcgcttaa  atc  taa
100788  96ORF056  3    36324..36623  99    aatttaacacaaagtaggtggcgta  atg  taa
100789  96ORF057  2    1394..1690    98    cttcagtggctcttttagcatttaa  ata  taa
100790  96ORF058  -3   26247..26537  96    tacttcttttctcataatctgacca  att  tga
100791  96ORF059  -1   21485..21772  95    agactcaacgcctttttgaacatac  ttg  tga
100792  96ORF060  -3   22647..22931  94    cctctttgtaaccgacaagactgta  ata  taa
100793  96ORF061  1    14023..14304  93    ttatctaattaagggggacgagtga  gtg  taa
100794  96ORF062  -2   38281..38559  92    tatataacttagcgattgtacttgc  ttg  taa
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100795 


96ORF063 


-3 


30786. .31064 


92 


gtctcctaatactacatcttgctta 


gtg 


tga 


100796 


96ORF064 


-2 


30205. .30480 


91 


atgcatctacttttggatgtaatac 


ata 


tag 


100797 


96ORF065 


1 


2617. .2886 


89 


aaggtctaataaaaatttctccttc 


ttg 


taa 


100798 


• 96ORF066 


3 


28056. .28325 


89 


aaggtgtagtcggctggttaactga 


att 


taa 


100799 


. 96ORF067 


-3 


17142. .17411 


89 


ttccgttattgcgtcgtgaagttgt 


ttg 


tga 


100800 


96ORF068 


2 


12326. .12589 


87 


aatgcatgtcgtttggtctgcctaa 


ttg 


tag 


100801 


96ORF069 


2 


42734. .42997 


87 


tttttaggcaacgatataagtaaaa 


gtg 


taa 


100802 


96ORF070 


1 


11869. .12129 


86 


aaatgttcaagaaatggagtgaagc 


ata 


taa 


100803 


96ORF071 


3 


15396. .15656 


86 


aacaagctatacaaattatcgataa 


att 


taa 


100804 


96ORF072 


-3 


37749. .38009 


86 


agattttttcgggttacccctagac 


att 


taa 


100805 


96ORF073 


3 


11244. .11501 


85 


acatgcatatatagaggtggaataa 


atg 


tag 


100806 


96ORF074 


-3 


42936. .43193 


65 


aattatttaacttactaattttctt 


ttg 


taa 


100807 


96ORF075 


-3 


26610. .26867 


85 


tactgccaatgttccatcttcaacc 


att 


taa 


100808 


96ORF076 


-1 


11126. .11380 


84 


tttatctaatacatttaagttaacc 


ate 


taa 


100809 


96ORF077 


-2 


16537. .16791 


64 


tacccaccatataggcaggtagtag 


gtg 


tag 


100810 


96ORF078 


-3 


19521. .19775 


84 


aataactttgaattgatacctcaac 


ata 


tga 


100811 


96ORF079 


3 


13608 . .13859 


83 


t tagggcaaatggaggcagacacaa 


atg 


tag 


100812 


96ORF080 


-3 


28029. .26280 


83 


tgagaagtcgccagtaagcaactga 


att 


tga 


100813 


96ORF081 


3 


20973 . .21221 


82 


aatgaagttatcccattcatgactt 


ate 


tag 


100814 


96ORF082 


-1 


8729. .8974 


81 


cgattattgtgctttcaatttcaaa 


ttg 


tga 


100815 


96ORF083 


-3 


3147. .3392 


81 


tttagcctttatataatcaacttct 


gtg 


tga 


100816 


96ORF084 


3 


1611. .1853 


80 


tgctttatctttagtttctttcttt 


ttg 


tga 


100817 


96ORF085 


-2 


29470 . .29709 


79 


ctcttatcaccttcgtttgtaggca 


ate 


taa 


100818 


96ORF086 


1 


35188 . .35424 


73 


gcgcaaggcgatttgggatatttaa 


ctg 


tag 


100819 


96ORF087 


-2 


13039. .13275 


78 


ttttgattgagctctaaagtgtctt 


att 


tag 


100820 


96ORF038 


3 


24930. .25163 


77 


gaactatcattaaaagttaaatgga 


ata 


tga 


100821 


96ORF089 


-3 


22329. .22562 


77 


tccagtataagatagtggtaatccc 


ata 


taa 


100822 


96ORF090 


-3 


16803 . . 17036 


77 


acctttagtcgaataccctgcgtca 


ata 


tag 


100823 


96ORF091 


-1 


22559. .22789 


76 


aacgcttctggtttaacgttcatgt 


atg 


taa 


100824 


96ORF092 


3 


18360 . . 18587 


75 


attgcaaaagatat tgtaagtagat 


atg 


taa 


100825 


96ORF093 


-2 


2S384 . .25608 


74 


catgatttccttgtaattctctttc 


ate 


taa 


100826 


96ORF094 


1 


10417. .10638 


73 


aacacacattaaggagtgttaaaaa 


atg 


tag 


100827 


96ORF095 


3 


12963 . .13184. 


73 


tactaaacgaagataaaactatgac 


att 


taa 


100828 


96ORF096 


1 


42994. .43212 


72 


gatcgcttgaaaacgaagaagataa 


ata 


taa 


100829 


96ORF097 


-1 


36047. .36265 


72 


tcaagcattacacctgtgacttttc 


ate 


taa 


100830 


96ORF098 


-2 


36766 . .36984 


72 


caggttccggtacaaatccagatga 


ata 


taa 


100831 


96ORF099 


-2 


34765. .34983 


72 


tcattctttttataaaacgggtacc 


atg 


tag 


100832 


96ORF100 


1 


10198. .10413 


71 


acaagaagactcagaggtttttcac 


atg 


taa 


100833 


96ORF101 


1 


15208. .15423 


71 


gagaaacaagttaagataaggagag 


atg 


tga 


100834 


96ORF102 


3 


4209. .4424 


71 


at 1 t t aaaacgaaat at aggagagg 


ctg 


tag 


100835 


96ORF103 


3 


11673. .11888 


71 


catgcaccttatggtatgcgcttag 


ctg 


taa 


100836 


96ORF104 


3 


12117. .12332 


71 


tttacgtccaaagagcttttgactt 


gtg 


taa 


100837 


95ORF105 


3 


23892. .24107 


71 


gatggtgggttatccagtgttataa 


gtg 


taa 


100838 


96ORF106 


-3 


34428 . .34643 


71 


tagacttttgccaatttgttgttga 


att 


taa 


100839 


96ORF107 


-3 


24495. .24710 


71 


ggcacattaccaattgttaatttaa 


atg 


taa 


100840 


96ORF108 


-1 


23876 . .24088 


70 


acatatttaaccacctctatgaaaa 


ata 


taa 


100841 


96ORF109 


-2 


17317. .17529 


70 


acctgtacgctttgctccgtgatta 


att 


taa 


100842 


96ORF110 


-3 


38931. .39143 


70 


actttcattcttttcgatgtaagaa 


atg 


taa 


100843 


960RF111 


-3 


21855. .22067 


70 


agtaaattttttcttttgtgctgtc 


. att 


tga 


100844 


960RF112 


1 


3217. .3426 


69 


aaatgtcaacgggaggtgatacgaa 


atg 


taa 


100845 


960RF113 


-1 


25469. .25678 


69 


tcagggatatatcctaaatatctag 


ctg 


taa 


100846 


960RF114 


-2 


9838. .10047 


59 


ataataatcatcacggtaaagtagc 


ate 


tga 


100847 


960RF115 


1 


13819. .14022 


67 


gcagtaggggttatggcaggtcaag 


ttg 


tga 


100848 


960RF116 


-1 


41033. .41236 


67 


caacttcatgacctgcatgtcttaa 


ata 


taa 


100849 


960RF117 


-3 


24711. .24914 


67 


tctgctgtattccatttaactttta 


atg 


taa 


100850 


960RF118 


-1 


12374. .12574 


66 


tccatctcctctaaaataaagttgg 


ttg 


taa 


100851 


960RF119 


-1 


3980. .4180 


66 


ctcctatatttcgttttaaaatttc 


att 


tga 


100852 


96ORF120 


-3 


6033. .6233 


66 


ttgtaatttagaaatataacgataa 


ata 


taa 


100853 


960RF121 


-2 


37939. .38136 


6S 


ctgaaatgccttgatacttgcctaa 


att 


tga 


100854 


960RF122 


2 


37892. .38086 


64 


acgacaaaaacaacaataagaatta 


gtg 


tga 


100855 


960RF123 


-3 


29193. .29387 


64 


ggacgtctgactttaaatgtgaagc 


ata 


tga 


100856 


960RF124 


1 


4408. .4599 


63 


tttatcggtaccaatttaatgatta 


atg 


taa 


100857 


960RF125 


-1 


7787.. 7978 


63 


ttaaaaatccaagttttgccatcgt 


att 


tga 


100858 


960RF126 


-3 


27027. .27218 


63 


aaatttgaacaacggcattaattga 


gtg 


tga 


100859 


960RF127. 


3 


15051. .15239 


62 


atcgagtcaaggaggttttggggaa 


gtg 


tga 


100860 


960RF128 


-1 


6914 . .7102 


62 


agcgaatgggtttgattgttgactc 


ata 


tga 


100861 


960RF129 


-3 


31332. .31520 


62 


tcttatttgctctgcttgtctataa 


atg 


tga 


100862 


96ORF130 


-3 


30084 . .30272 


62 


gaaatcatcttcaccttcaacatga 


gtg 


taa 


100863 


960RF131 


3 


11058. .11243 


61 


agaaaaagagaaatgaagtgatcta 


atg 


taa 


100864 


960RF132 


-1 


36434. .36619 


61 


taagcatggtaatcacctcctttaa 


. ata 


tga 


100865  96ORF133  -1  35591..35776  61  ctaaactattgcgtaaaccgccagt  att  taa


100866  96ORF134  -2  9250..9435  61  atccatgagcttataacccgtctta  att  tga
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100867 


96ORF135


1 


29563..29745


60 


cgacaactttttgtaggactagtaa


gtg 


tga 


100868  96ORF136  -3  12486..12668  60  cactttacttccaacttgttcagga  ttg  taa


100869  96ORF137  -1  14501..14680  59  caaactgaaagctaagtaatcagca  atc  tga


100870  96ORF138  -2  23326..23505  59  cttgtgacatttgatgaaattttag  ttg  tga


100871  96ORF139  -3  42672..42851  59  aatccggaatttttagcaattttat  atc  taa


100872  96ORF140  -3  31137..31316  59  acttgattgactagtaaagtcgtac  atg  taa


100873  96ORF141  -3  18969..19148  59  aacaaaaataacattatagggatct  ata  taa


100874  96ORF142  -3  4740..4919  59  cataaattttgttttttagttttat  att  tga


100875  96ORF143  2  36107..36283  58  aacaaatactgagggggacgtttaa  atg  taa


100876 


96ORF144


3 
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ate 


tga 


101022  96ORF290  3  40917..41036  39  gcaaatttaaacactttcacatcat  atg  taa


101023  96ORF291  -2  38815..38934  39  tctctaaaaacagcttacagcgaac  ata  taa


101024  96ORF292  -2  32671..32790  39  ctataggattataaatcgctgacgt  ata  tga


101025  96ORF293  -2  31216..31335  39  ttgatttgatgtttcttatacttga  ttg  taa


101026  96ORF294  -2  21589..21708  39  gtatcttcatcagaatcgcctaaaa  atc  taa


101027  96ORF295  -2  18976..19095  39  tatcaatatatgctaacctagcacc  ata  taa


101028  96ORF296  -2  11482..11601  39  gccacctcgtactctttttgcaacc  att  taa


101029  96ORF297  -3  12933..13052  39  tcacgaaataatgtttctttaattt  ata  taa


101030  96ORF298  -3  8262..8381  39  gaactgatcttgcttaaatgattta  att  tag


101031  96ORF299  -3  6993..7112  39  cattagcattagcgaatgggtttga  ttg  tga


101032  96ORF300  2  23516..23632  38  actacatctgaacaactaaaatttc  atc  tag


101033  96ORF301  2  25943..26059  38  agattagaagaagaaaaaagaagac  gtg  taa


101034  96ORF302  2  36929..37045  38  tattggggttttgtaacatggggca  atg  tag


101035  96ORF303  3  4476..4592  38  ataaaagctacctagtagcagtact  atg  tga


101036  96ORF304  3  20586..20702  38  tactctaagatagctaaagcaatac  gtg  tga


101037  96ORF305  3  28356..28472  38  cggttaccaatgtgcttgatacgat  ttg  taa


101038  96ORF306  -1  24359..24475  38  acttaaataaaagccgtatcgtgcc  atg  taa


101039  96ORF307  -1  20147..20263  38  ttgtacctatacgagttaactcctt  att  tag


101040  96ORF308  -2  38158..38274  38  ttccgtatccactttctaagaaagc  gtg  tga


101041  96ORF309  -2  35149..35265  38  agcttgtttgtatcgtctttaacga  ata  taa


101042  96ORF310  -2  31423..31539  38  gtaatatgattaggtctcctcttat  ttg  taa


101043  96ORF311  -2  10438..10554  38  cgcctttaaatcgttttaggtcact  atc  taa


101044  96ORF312  -2  1390..1506  38  gagaacaacacaaacattaacaaca  atc  taa


101045  96ORF313  -3  33051..33167  38  acgtcctgtttctagatcgtaatac  ata  tag


101046  96ORF314  -3  25194..25310  38  agcaaaccgttaaagataacattga  atc  taa


101047  96ORF315  -3  6273..6389  38  cattcttgctaacacgtcagattga  ctg  tga


101048  96ORF316  -3  4281..4397  38  ataattcgtattcattaatcattaa  att  tag


101049  96ORF317  1  2260..2373  37  atgactccttttctcatatttcttt  ata  taa


101050  96ORF318  2  21230..21343  37  atttcacacttttttagttgtctct  ctg  taa


101051  96ORF319  3  18018..18131  37  atactgagtcaccaatttaagctcg  atg  tag


101052  96ORF320  3  36972..37085  37  attacagatatcctaagggtttccg  att  taa


101053  96ORF321  -1  36302..36415  37  ctcttgagttttttgacctaattta  atc  taa


101054  96ORF322  -1  32606..32719  37  ccataagttatttctccagttctat  att  taa


101055  96ORF323  -1  11453..11566  37  ttaaaccgttcttttttatcaattc  att  tga


101056  96ORF324  -1  7268..7381  37  tactggttcgccccagtgaagttct  ata  tga


101057  96ORF325  -2  32347..32460  37  ttactgcatttgtatatggcgataa  atc  tag


101058  96ORF326  -2  24682..24795  37  acgtttattacgctcataaagccat  ata  tag


101059  96ORF327  -2  23905..24018  37  aaatggctgtggcgcttgaccatat  gtg  taa


101060  96ORF328  -2  21460..21573  37  agagcactaatacgtttttgttctt  ctg  tga


101061  96ORF329  -2  21208..21321  37  gacttaacttcttcgatattcatat  atc  tga


101062  96ORF330  -2  18085..18198  37  ccagtcgacaccagcaaagtattct  ttg  tag


101063  96ORF331  -2  8170..8283  37  actttgagacgtcgtctgtctctct  atg  tag


101064  96ORF332  -2  5971..6084  37  caatttgttttccgttttctcttag  ttg  tag


101065  96ORF333  -3  37632..37745  37  accttgcttaatcaagtcgtaatta  att  tga


101066  96ORF334  -3  29628..29741  37  ctgagttagtgttgtaaaatgtcat  ttg  tag


101067  96ORF335  -3  7164..7277  37  ttagcggatatccgttttctagtaa  atc  taa


101068  96ORF336  1  22903..23013  36  gtaaaaaaagacaatatgactatta  ctg  tga


101069  96ORF337  1  43258..43368  36  taattgacgtggttattttttaggt  ttg  taa


101070  96ORF338  2  12668..12778  36  gaactggtggaatgggcatggaaca  atc  tag


101071  96ORF339  2  28292..28402  36  ctcactgctttaattcagttgctta  ctg  taa


101072  96ORF340  2  35396..35506  36  ctcctaatgaacataagtcaacggt  att  tga


101073  96ORF341  3  25428..25538  36  actcgagaacaattagaaaaagcaa  ttg  tga


101074  96ORF342  -1  40913..41023  36  tatctgggaaatttaatctaataaa  ata  tga


101075  96ORF343  -1  39173..39283  36  tgccacattttagtgtcaggattga  ttg  taa


101076






37580..37690


36 


gggtctacctttaacgtcgtttcag 


ata 


taa 


101077 


96ORF345




31556..31666


36 


ggactattctttctaataacctcaa 


ttg 


tga 


101078 


96ORF346




29972..30082


36 


ggctactccttatctaaaatataat 


ttg 


taa 


101079 


96ORF347




28787..28897


36 


ctgccaaagtctgtagcaattactt 


ttg


tga 


101080 


96ORF348




21839..21949


36 


ttaaaatccgataaaataacattgc 


ctg 


tga 


101081 


96ORF349




3647..3757


36 


taaaacttccgaagttacccagcgt 


ttg 


tga 
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101082 ' 


96ORF350


-2 


40801..40911


36 


accattccaattttgcccatatgat 


gtg 


tag 


101083  96ORF351  -2  38953..39063  36  tatcttttaaaattctcgtaatagc  atc  taa


101084  96ORF352  -2  31585..31695  36  tagctgtcatcactagtatttttga  atc  taa


101085  96ORF353  -2  24550..24660  36  atagtccgttttaccgcctcgtact  att  tag


101086  96ORF354  -2  20083..20193  36  atcatcattttgatatttctcaaac  ata  tga


101087  96ORF355  -2  991..1101  36  gcatcttggcagtacgacgtaaaac  atc  tag


101088  96ORF356  -3  38148..38258  36  taagaaagcgtgcgcgatcaaataa  att  tga


101089  96ORF357  -3  8790..8900  36  tgaagttatctagcgctatttttct  ttg  tag


101090  96ORF358  -3  4458..4568  36  ttcataaaagtattctttgtagtat  atg  tag


101091  96ORF359  1  4666..4773  35  ttatcaaaatatacaacttaattaa  atc  tag


101092  96ORF360  1  11569..11676  35  ataaatttaccgaacatgaaaatga  att  tga


101093  96ORF361  2  6122..6229  35  ggaaaacaaattgatgttgtagtga  ttg  taa


101094  96ORF362  -1  40418..40525  35  ttcgtaggtgtcattacttctttaa  ttg  tag


101095  96ORF363  -1  34358..34465  35  gttttgcttgatttcgatttgttga  atg  tga


101096  96ORF364  -1  20654..20761  35  ctatttccactgattccccatctaa  atg  tga


101097  96ORF365  -1  8423..8530  35  tcttttttagagttacgaggtttca  att  tag


101098  96ORF366  -1  2402..2509  35  tgacgtatggcaacattttagatca  atc  taa


101099  96ORF367  -2  36607..36714  35  aaaataaaaagccagtgccgaagca  ctg  tag


101100  96ORF368  -2  27061..27168  35  caaatcgtcctgcagcgttcaataa  atc  tag


101101  96ORF369  -2  26470..26577  35  atgagttgttaagtttaccccaaat  atc  taa


101102  96ORF370  -2  10327..10434  35  ccgtgccatcttctcggtataagta  ata  taa


101103  96ORF371  -2  8650..8757  35  gggtacgggttgttactgttgatat  atc  taa


101104  96ORF372  -3  14382..14489  35  gttcttttaattgatctactgttaa  att  taa


101105  96ORF373  -3  8151..8258  35  atgtttgttagtctctgtgtagtct  atg  taa


101106  96ORF374  -3  5007..5114  35  aaacgatttaagtggaacattattc  ata  taa


101107  96ORF375  2  30563..30667  34  cgattagaaatctttaaaaaaggac  ttg  tga


101108  96ORF376  -1  19916..20020  34  tctatgtcaggtaatttgtcattaa  att  taa


101109  96ORF377  -1  9236..9340  34  cttttctgttagtaattgtttttaa  atc  taa


101110  96ORF378  -1  9026..9130  34  actctttatctttagttgcttttaa  ata  tag


101111  96ORF379  -2  28447..28551  34  cttttgtgataataaagtttagtgc  ttg  tga


101112  96ORF380  -3  40329..40433  34  ccatttaccttcttgagatgttgga  ttg  tga


101113  96ORF381  -3  39801..39905  34  caaaagatgaaggctttttccatac  ttg  taa


101114  96ORF382  -3  33831..33935  34  atgttgtttgtaactcgattaagtt  atc  tga


101115  96ORF383  -3  33687..33791  34  gttattacgtcttaatacttgtgtt  gtg  tag


101116  96ORF384  -3  13530..13634  34  tatacgcactagtactgatcactga  ttg  taa


101117  96ORF385  -3  3843..3947  34  tttgattgattgttctagttaagaa  att  taa


101118  96ORF386  1  12256..12357  33  agtcataaagaagttagcaatgtga  ttg  tag


101119  96ORF387  2  2207..2308  33  tccaagactctttaactgttaactt  atc  tag


101120  96ORF388  2  2519..2620  33  attgttgaatttcgattgatctaaa  atg  tga


101121  96ORF389  2  22517..22618  33  agaagtaaaatgcgtaatgctttag  atg  tag


101122  96ORF390  2  27302..27403  33  ttccaaaattgggctaatagtgtag  ctg  taa


101123  96ORF391  2  32384..32485  33  actaaaaaggttgagaaagctgtag  atg  taa


101124  96ORF392  2  39287..39388  33  aaaaacggtactgtagtatcaatca  atc  tag


101125  96ORF393  3  18153..18254  33  gtagtatatgccgactttgatttga  atg  taa


101126  96ORF394  3  24189..24290  33  tcagaccctaacattaacaaactag  ttg  tga


101127  96ORF395  -1  15266..15367  33  tcgataatttgtatagcttgtttta  atg  tag


101128  96ORF396  -2  32239..32340  33  ttttagtgaaagcatctagtgttga  ata  tag


101129  96ORF397  -2  16123..16224  33  ttatgtgtgcctatcatattaacaa  ttg  tag


101130  96ORF398  -2  13648..13749  33  tctttaactgaatgttgaatagcat  ttg  tag


101131  96ORF399  -2  10987..11088  33  acttctgtaggtattcttatatcaa  ttg  tga


101132  96ORF400  -2  3382..3483  33  cttactggtaattcttcaaaattaa  atg  taa


101133  96ORF401  -3  40794..40895  33  ccatatgatgtgaaagtgtttaaat  ttg  taa


101134  96ORF402  -3  39978..40079  33  atattcctaaatcacttgaacctaa  att  tga


101135  96ORF403  -3  38607..38708  33  atcttcagtgtaaaatcgacagcca  atg  tag


101136  96ORF404  -3  21288..21389  33  cagacaccgtcttaagtccctttag  ata  taa
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SEQUENCE INFORMATION FOR PHAGES MATCHING WITH TABLE 1 

M32695 

Bacteriophage PM2 nuclease cleavage site 
gi|166145|gb|M32695|BM2NCS [166145]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32693 

Bacteriophage PM2 Hind III fragment 4
gi|166144|gb|M32693|BM24HIND3 [166144]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32693 

Bacteriophage PM2 Hind III fragment 4 
gi|166144|gb|M32693|BM24HIND3 [166144]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32694 

Bacteriophage PM2 Hind III fragment 3 
gi|166143|gb|M32694|BM23HIND3 [166143]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M26134 

Bacteriophage PM2 structural protein gene containing purine/pyrimidine rich 
regions and anti-Z-DNA-IgG binding regions, complete cds 
gi|289360|gb|M26134|BM2PROTIV [289360]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
J02452 

bacteriophage f1 3'-terminal region rna
gi|215409|gb|J02452|PF1TR3 [215409]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
AF020798 

Bacteriophage Chp1 genome DNA, complete sequence
gi|217761|dbj|D00624|BCP1 [217761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 12 protein links, or 1 genome link)
X72793 

Clostridium botulinum C phage BONT/C1, ANTP-139, ANTP-33, ANTP-17, ANTP-70
genes and ORF-22
gi|516171|emb|X72793|CBCBONT [516171]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 4 nucleotide neighbors)
X51464 

Clostridium botulinum D Phage C3 gene for exoenzyme C3 
gi|14907|emb|X51464|CBDPE3 [14907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
D90210 

Bacteriophage c-st (from C. botulinum) C1-tox gene for botulinum C1 neurotoxin
gi|217780|dbj|D90210|CSTC1TOX [217780]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
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S49407 

type D neurotoxin [bacteriophage d-16 phi, host = C. botulinum, type D, CB16, Genomic, 4087 nt] 
gi|260238|gb|S49407|S49407 [260238] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X53370 

Bacteriophage phi29 temperature sensitive mutant TS2(98) DNA polymerase gene 
gi|15733|emb|X53370|POTS298 [15733]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X53371 

Bacteriophage phi29 temperature sensitive mutant TS2(24) DNA polymerase gene 
gi|15731|emb|X53371|POTS224 [15731]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X05973 

Bacteriophage phi29 prohead RNA 
gi|15680|emb|X05973|POP29PRO [15680] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
V01155

Left end of bacteriophage phi-29 coding for 15 potential proteins. Among these are the terminal protein and the proteins encoded by the genes 1, 2 (sus), 3, and (probably) 4
gi|15659|emb|V01155|POP29B [15659]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 16 protein links, or 16 nucleotide neighbors)
X73097 

Bacteriophage phi-29 left origin of replication 
gi|312194|emb|X73097|BP29ORIL [312194]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M14430 

Bacteriophage phi-29 gene-17 gene, complete cds
gi|215321|gb|M14430|P29G17A [215321]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 8 nucleotide neighbors)
M14431 

Bacteriophage phi-29 gene-16 gene, complete cds
gi|215319|gb|M14431|P29G16A [215319]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 7 nucleotide neighbors)
M20693 

Bacteriophage phi-29 DNA, 3' end 
gi|215343|gb|M20693|P29REPINB [215343] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M21016 

Bacteriophage phi-29 DNA, 5' end
gi|215342|gb|M21016|P29REPINA [215342]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
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M12456 

Bacteriophage phi-29 genes 9, 10 and 11 encoding p9 tail, incomplete, p10
connector, complete, and p11 lower collar, incomplete, respectively
gi|215338|gb|M12456|P29P9 [215338]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)
M14782

Bacillus phage phi-29 head morphogenesis, major head protein, head fiber protein, tail protein, upper collar protein, lower collar protein, pre-neck appendage protein, morphogenesis(13), lysis, morphogenesis(15), encapsidation genes, complete cds
gi|215323|gb|M14782|P29LATE2 [215323]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 11 nucleotide neighbors)
M26968 

Bacteriophage phi-29 (from Bacillus subtilis) proteins p1 delta-1 genes, complete cds, and the sus1(629) mutation
gi|341558|gb|M26968|P29P1D1A [341558]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
J02448 

Bacteriophage f1, complete genome
gi|166201|gb|J02448|F1CCG [166201]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 205 nucleotide neighbors, or 1 genome link)

M24832 

Bacteriophage f2 coat protein gene, partial cds 
gi|166228|gb|M24832|F2CRNACA [166228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
J02451 

Bacteriophage fd, strain 478, complete genome 
gi|215394|gb|J02451|PFDCG [215394]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 10 protein links, 204 nucleotide neighbors, or 1 genome link)

M34834 

Bacteriophage fr replicase gene, 5' end 

gi|166139|gb|M34834|BFRREGRA [166139]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)

M38325

Bacteriophage fr replicase gene, 5' end
gi|166137|gb|M38325|BFRREGR [166137]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
M35063 

Bacteriophage fr coat protein replicase cistron (R region) RNA 
gi|166134|gb|M35063|BFRRCRRA [166134]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)
S66567 

alpha-atrial natriuretic factor/coat protein fusion polypeptide [human,
bacteriophage fr, expression vector pFANIS, PlasmidSyntheticRecombinant, 510 nt] 

S£2SfflJ!!S5^5!SSMN.l ™».< MHDUW m. 1 protein ■* « 13 ririi n tig hbo„ 
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X15031 

Bacteriophage fr RNA genome 
gi|15071|emb|X15031|LEBFRX [15071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, 9 nucleotide neighbors, or 1 genome link)

U51233 

Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable
region light chain (IgM) mRNA, partial cds
gi|1277150|gb|U51233|MMU51233 [1277150]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1669 nucleotide neighbors)
U51232 

Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable region heavy chain (IgM) mRNA, partial cds 
gi|1277148|gb|U51232|MMU51232 [1277148]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1073 nucleotide neighbors)
U02303 

Bacteriophage If1, complete genome
gi|3676280|gb|U02303|B2U02303 [3676280]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, or 1 genome link)

V00604 
Phage M13 genome
gi|14959|emb|V00604|INM13X [14959]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 205 nucleotide neighbors)

A32252 

Synthetic bacteriophage M13 protein III probe 
gi|1567340|emb|A32252|A32252 [1567340]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
A32251 

Synthetic bacteriophage M13 protein III probe
gi|1567339|emb|A32251|A32251 [1567339]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M12465 

Bacteriophage M13 mp10 mutations in lac operon
gi|215210|gb|M12465|M13LACMUT [215210]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 215 nucleotide neighbors)
M24177 

Synthetic Bacteriophage M13 (clone M13.SV.B12) SV40 early promoter region DNA 
gi|209416|gb|M24177|SYNSVB12 [209416]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24176 

Synthetic Bacteriophage M13 (clone M13.SV.B11) SV40 early promoter region DNA
gi|209415|gb|M24176|SYNSVB11 [209415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
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M24175 

Synthetic Bacteriophage M13 (clone M13.SV.8) SV40 early promoter region DNA
gi|208806|gb|M24175|SYNM13SV8 [208806]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 242 nucleotide neighbors)
M19979

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 
gi|207813|gb|M19979|SYN33M13M [207813] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 617 nucleotide neighbors)
M19565 

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 
gi|207808|gb|M19565|SYN33M13H [207808] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 567 nucleotide neighbors)
M19564 

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 
gi|207807|gb|M19564|SYN33M13G [207807]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 12 nucleotide neighbors)
M19563 

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 
gi|207806|gb|M19563|SYN33M13F [207806] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 262 nucleotide neighbors)
M19561 

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 
gi|207804|gb|M19561|SYN33M13D [207804]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 27 nucleotide neighbors)
M19560 

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 
gi|207803|gb|M19560|SYN33M13C [207803]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M19559 

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 
gi|207802|gb|M19559|SYN33M13B [207802]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 227 nucleotide neighbors)
M10568 

Bacteriophage M13 replicative form II, replication origin, specific nick location
gi|215220|gb|M10568|M13ORIB [215220]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 650 nucleotide neighbors)
M10910 

Bacteriophage M13 gene II regulatory region and M13sjl mutant 
gi|215209|gb|M10910|M13IIREG [215209]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 72 nucleotide neighbors)
M38295 

Bacteriophage M13 HaeIII restriction fragment DNA
gi|215208|gb|M38295|M13HAEIII [215208]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 67 nucleotide neighbors)






E02067 

DNA encoding a part of Bacteriophage M13 tg 127 
gi|2170311|dbj|E02067|E02067 [2170311]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
J02467 

Bacteriophage MS2, complete genome 
gi|215232|gb|J02467|MS2CG [215232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE links, 4 protein links, 20 nucleotide neighbors, or 1 genome link)

AJ004950 

Bacteriophage P1 ban gene
gi|3688226|emb|AJ011592|BP1011592 [3688226]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
U88974 

Bacteriophage P1 structural lytic transglycosylase (orf47), pep44b (orf44b), pep44a (orf44a), and pep43 (orf43) genes, complete cds; and pep42 (orf42) gene, partial cds
gi|2661099|gb|AF035607|AF035607 [2661099]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 1 nucleotide neighbor)

AJ000741 

Bacteriophage P1 darA operon
gi|2462938|emb|AJ000741|BPAJ7641 [2462938]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 31 nucleotide neighbors)
X01828 

Bacteriophage P1 recombinase gene cin
gi|15133|emb|X01828|MYP1CIN [15133]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X98146 

Bacteriophage P1 DNA sequence around the Op88 operator
gi|1359513|emb|X98146|BP1OP88OP [1359513]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
S61175 

immI operon: icd=cell division repressor, ant1=antirepressor {promoters
P51a, P51b} [bacteriophage P1, Genomic, 728 nt]
gi|385908|gb|S61175|S61175 [385908]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)

X87824 

Bacteriophage P1 gene 26
gi|861164|emb|X87824|XXBP1G26 [861164]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
X15638 

Phage P1 DNA for lytic replicon containing promoter P53 and two open reading frames
gi|15735|emb|X15638|PP1LREP [15735]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 24 nucleotide neighbors)



X17512 

Bacteriophage P1 DNA for immunity region immI
gi|15479|emb|X17512|P1IMMUNITY [15479]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
X16005 

Bacteriophage P1 c1 gene for P1 c1 repressor protein
gi|15477|emb|X16005|P1C1 [15477]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X03453 

Bacteriophage P1 cre gene for recombinase protein
gi|15135|emb|X03453|MYP1CRE [15135]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)
X06561 

Bacteriophage P1 c1 gene 5'-region
gi|15128|emb|X06561|MYP1C1 [15128]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 6 nucleotide neighbors)
V01534

Bacteriophage P1 genome fragment (IS2 insertion spot). This region contains four unidentified reading frames and is known as insertion hot spot for IS2 insertion sequences
gi|15118|emb|V01534|MYOVP1 [15118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)

X56951 

Bacteriophage P1 gene 10
gi|406728|emb|X56951|BPP1GP10 [406728]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
K02380

Bacteriophage P1 replication region including repA, parA, and parB genes and
incA, incB, and incC incompatibility determinants
gi|215652|gb|K02380|PP1REP [215652]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 4 protein links, or 8 nucleotide neighbors)
X87674 

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)

X87673 
Bacteriophage P1 gene 17
gi|974761|emb|X87673|BACP117 [974761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M16618 

Bacteriophage P1 c1 repressor binding sites
gi|215600|gb|M16618|PP1C1 [215600]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
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SEG_PPtCIN 

Bacteriophage P1 cin gene encoding recombinase, cixL recombination site, and 5' end of C invertible element
gi|215607|gb||SEG_PP1CIN [215607]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
K03173 

Bacteriophage P1 C invertible element, right end, and cixR recombination site
gi|215606|gb|K03173|PP1CIN2 [215606]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
215605 

Bacteriophage P1 cin gene encoding recombinase, cixL recombination site, and 5' end of C invertible element
gi|215605|lcl|X01828 [215605]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M25470 

Bacteriophage P1 tail fiber protein gene, complete cds
gi|341349|gb|M25470|PP1TFPR [341349]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
M34382 

Bacteriophage P1 sim region proteins, complete cds
gi|215661|gb|M34382|PP1SIM [215661]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M81956 

Bacteriophage P1 R protein (R) gene, complete cds
gi|215658|gb|M81956|PP1RP [215658]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 4 nucleotide neighbors)
M37080 

Bacteriophage P1 mini-P1 plasmid origin of replication
gi|215657|gb|M37080|PP1REPOR [215657]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 46 nucleotide neighbors)
M27041 

Bacteriophage P1 ref gene, complete cds
gi|215650|gb|M27041|PP1REF [215650]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)

L01408
Bacteriophage P1 partition protein (parB) gene, 3' end
gi|215642|gb|L01408|PP1PARB [215642]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 41 nucleotide neighbors)

SEG_PP1PAR 
Bacteriophage miniplasmid P1 parA gene, 5' end
gi|215639|gb||SEG_PP1PAR [215639]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 48 nucleotide neighbors)
M36425 

Bacteriophage miniplasmid P1 parB gene, 3' end
gi|215638|gb|M36425|PP1PAR2 [215638]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
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M36424 

Bacteriophage miniplasmid P1 parA gene, 5' end
gi|215637|gb|M36424|PP1PAR1 [215637]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11129

Bacteriophage P1 miniplasmid origin of replication region
gi|215632|gb|M11129|PP1ORIM [215632]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 43 nucleotide neighbors)

M25414
Bacteriophage P1 c1 repressor binding site, operator 88 (Op88)
gi|215631|gb|M25414|PP1OP88A [215631]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M25413 

Bacteriophage P1 c1 repressor binding site, operator 68 (Op68)
gi|215630|gb|M25413|PP1OP68A [215630]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25412 

Bacteriophage P1 c1 repressor binding site, operator 21 (Op21)
gi|215629|gb|M25412|PP1OP21A [215629]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10510 

Bacteriophage PI recombination site loxR 
gi|2 1 5628|gb|Ml 05 10|PP 1 LOXR (2 1 5628] 

(View GenBank report,FASTA report^ASN.l report,Graphical view,l MEDLINE link, or 1 nucleotide neighbor ) 
M10287

Bacteriophage P1 loxP X loxP recombination site
gi|215627|gb|M10287|PP1LOXPX [215627]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 13 nucleotide neighbors )
M10494

Bacteriophage P1 recombination site loxP
gi|215626|gb|M10494|PP1LOXP [215626]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 134 nucleotide neighbors )
M10511

Bacteriophage P1 recombination site loxL
gi|215625|gb|M10511|PP1LOXL [215625]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 nucleotide neighbor )
M10512

Bacteriophage P1 recombination site loxB
gi|215624|gb|M10512|PP1LOXB [215624]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
M10145

Bacteriophage P1 genome fragment with recombination site loxP
gi|215623|gb|M10145|PP1CREX [215623]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 21 nucleotide neighbors )
M13327

Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI326
gi|215622|gb|M13327|PP1CN26IV [215622]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 7 nucleotide neighbors )
M13325

Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI326
gi|215621|gb|M13325|PP1CN26II [215621]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1401 nucleotide neighbors )
M13323

Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI325
gi|215620|gb|M13323|PP1CN25IV [215620]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 7 nucleotide neighbors )
M13321

Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI325
gi|215619|gb|M13321|PP1CN25II [215619]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1058 nucleotide neighbors )
M13324

Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI326
gi|215618|gb|M13324|PP1CIR26I [215618]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 7 nucleotide neighbors )
M13319

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI327
gi|215617|gb|M13319|PP1CIN27R [215617]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 7 nucleotide neighbors )
M13320

Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI325
gi|215616|gb|M13320|PP1CIN25I [215616]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 7 nucleotide neighbors )
M13318

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI324
gi|215615|gb|M13318|PP1CIN24L [215615]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1370 nucleotide neighbors )
M13317

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI323
gi|215614|gb|M13317|PP1CIN23M [215614]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1055 nucleotide neighbors )
M13316

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI323
gi|215613|gb|M13316|PP1CIN23L [215613]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 7 nucleotide neighbors )
M13315

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI322
gi|215612|gb|M13315|PP1CIN22R [215612]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 7 nucleotide neighbors )
M13314

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI322
gi|215611|gb|M13314|PP1CIN22L [215611]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1401 nucleotide neighbors )
M13313

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI321
gi|215610|gb|M13313|PP1CIN21R [215610]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 7 nucleotide neighbors )
M13312

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI321
gi|215609|gb|M13312|PP1CIN21L [215609]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1058 nucleotide neighbors )
M16568

Bacteriophage P1 c4 repressor gene, complete cds
gi|215603|gb|M16568|PP1C4 [215603]
(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )

M13326

Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI326
gi|215602|gb|M13326|PP1C26III [215602]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1192 nucleotide neighbors )
M13322

Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI325
gi|215601|gb|M13322|PP1C25III [215601]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1231 nucleotide neighbors )
J05651

Bacteriophage P1 modulator protein (bof) gene, complete cds
gi|215598|gb|J05651|PP1BOFY1 [215598]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors )
M33224

Bacteriophage P1 regulatory protein (bof) gene, complete cds
gi|215596|gb|M33224|PP1BOFFO [215596]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors )
M10288

E.coli/bacteriophage P1 loxR recombination site
gi|146647|gb|M10288|ECOLOXR [146647]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 3 nucleotide neighbors )
M10289

E.coli/bacteriophage P1 loxL recombination site
gi|146646|gb|M10289|ECOLOXL [146646]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 nucleotide neighbors )
M10290

E.coli loxB site, which can recombine with bacteriophage P1 loxP site
gi|146645|gb|M10290|ECOLOXB [146645]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 nucleotide neighbors )
M10287

Bacteriophage P1 loxP X loxP recombination site
gi|215627|gb|M10287|PP1LOXPX [215627]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 13 nucleotide neighbors )
M74046

Bacteriophage P1 pacA and pacB genes, complete cds
gi|215634|gb|M74046|PP1PACAB [215634]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 protein links )
M95666

Bacteriophage P1 gene 10, doc and phd genes, complete cds
gi|463276|gb|M95666|PP1PHDDOC [463276]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, 4 protein links, or 1 nucleotide neighbor )
M25604

Bacteriophage Q-beta mutated autonomously replicating sequence MDV1 RNA fragment
gi|556359|gb|M25604|PQBARSMUT [556359]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 8 nucleotide neighbors )
V00643

first half of the phage Q-beta gene for coat protein
gi|15088|emb|V00643|LEQBET [15088]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )
M25167

Bacteriophage Q-beta RNA fragment recovered from replicase binding complex
gi|556362|gb|M25167|PQBREPLICB [556362]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 nucleotide neighbors )
M24876

Bacteriophage Q-beta replicase RNA, 5' end
gi|556360|gb|M24876|PQBREPLICA [556360]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )
M25444

Synthetic bacteriophage Q-beta DNA
gi|209118|gb|M25444|SYNPQBTERM [209118]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 8 nucleotide neighbors )
M25463

Bacteriophage Q-beta self-replicating microvariant (+) RNA
gi|532489|gb|M25463|PQBMVSRRNA [532489]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
M25014

Bacteriophage Q-beta RNA replicase gene, 5' end, and maturation protein gene, 3' end
gi|294316|gb|M25014|PQBREPLC [294316]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors )
M25065

Bacteriophage Q-beta RNA sequence with putative stem loop
gi|294315|gb|M25065|PQBLOOP [294315]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 3 nucleotide neighbors )
M10265

Bacteriophage Q-beta RNA molecule with the ability to replicate extracellularly
gi|215726|gb|M10265|PQBRNA [215726]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 8 nucleotide neighbors )
M24815

Bacteriophage Q-beta specified replicase subunit RNA,
gi|215725|gb|M24815|PQBREPL [215725]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 4 nucleotide neighbors )

M25461

Bacteriophage Q-beta plus-strand RNA, 5' terminus
gi|215724|gb|M25461|PQBPS5E [215724]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
M25462

Bacteriophage Q-beta plus-strand RNA, 3' terminus
gi|215723|gb|M25462|PQBPS3E [215723]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 8 nucleotide neighbors )

M24871

Bacteriophage Q-beta nanovariant WSIII RNA
gi|215722|gb|M24871|PQBNVWSIC [215722]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 nucleotide neighbors )
M24870

Bacteriophage Q-beta nanovariant WSII RNA
gi|215721|gb|M24870|PQBNVWSIB [215721]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 nucleotide neighbors )
M24869

Bacteriophage Q-beta nanovariant WSI RNA
gi|215720|gb|M24869|PQBNVWSIA [215720]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 nucleotide neighbors )
M10495

Coliphage Q-beta MDV-1(+) RNA
gi|215719|gb|M10495|PQBMDV1A [215719]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 10 nucleotide neighbors )
J02484

bacteriophage qbeta coat protein cistron first half
gi|215717|gb|J02484|PQBCP5 [215717]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors )
M57754

Bacteriophage Q-beta minus strand RNA, 5' terminus
gi|215716|gb|M57754|PQBBMS5E [215716]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 8 nucleotide neighbors )
M24297

Bacteriophage Q-beta 5'-terminal region of the minus strand
gi|215715|gb|M24297|PQB5END [215715]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 8 nucleotide neighbors )
M10695

Bacteriophage Q-beta, MDV-1 RNA
gi|215714|gb|M10695|PQB IER [215714]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, or 12 nucleotide neighbors )
M24827

Bacteriophage R17 A protein gene, 5' end
gi|216078|gb|M24827|R17RNACtS [216078]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 5 nucleotide neighbors )
M24829

Bacteriophage R17 coat protein gene, 5' end
gi|216075|gb|M24829|R17CP5 [216075]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 5 nucleotide neighbors )
J02488

bacteriophage r17 rna synthetase initiation site
gi|216080|gb|J02488|R17RNASYN [216080]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 MEDLINE links, 2 protein links, or 6 nucleotide neighbors )
J02487

bacteriophage r17 coat protein initiation site
gi|216073|gb|J02487|R17COATP [216073]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
J02486

bacteriophage r17 a protein initiation site
gi|216071|gb|J02486|R17APROT [216071]
(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )

M24826

Bacteriophage R17 coat protein RNA fragment
gi|216077|gb|M24826|R17CPRAA [216077]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 7 nucleotide neighbors )
M24296

Bacteriophage R17 3'-terminal fragment A RNA
gi|216070|gb|M24296|R173TFA [216070]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 9 nucleotide neighbors )
1TFN

structure refinement for a 24-nucleotide rna hairpin, nmr, minimized average
structure ribonucleic acid, hairpin, bacteriophage r17 mol_id: 1; molecule: r17c; chain: null; engineered: yes
gi|1942336|pdb|1TFN| [1942336]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 structure link )
1RHT

rna (5'-d(gpgpgpapcpupgpapcpgpapupcpapcpgpcpapgpupcpupapu-3') (24-mer rna
hairpin coat protein binding site for bacteriophage r17) (nmr, minimized average structure)
gi|1421020|pdb|1RHT| [1421020]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 structure link )



M14428

Bacteriophage S13 circular DNA, complete genome
gi|216089|gb|M14428|S13CG [216089]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, 12 protein links, 26 nucleotide neighbors,
or 1 genome link )

J05393

Bacteriophage T1 DNA N-6-adenine-methyltransferase (M.T1) gene, complete cds
gi|166163|gb|J05393|BT1NAMTA [166163]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 protein links )
L46845

Bacteriophage T2 frd3, frd2 genes, complete cds
gi|951387|gb|L46845|PT2FRD32G [951387]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 protein links, or 17 nucleotide neighbors )
L43611

Bacteriophage T2 fibritin (wac) gene, complete cds
gi|903869|gb|L43611|PT2WAC [903869]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 4 nucleotide neighbors )
M24812

Bacteriophage T2 secondary structure RNA sequence
gi|215796|gb|M24812|PT2RNA [215796]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 4 nucleotide neighbors )
M22342

Bacteriophage T2 DNA-(adenine-N6)methyltransferase (dam) gene, complete cds
gi|215792|gb|M22342|PT2DAM [215792]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors )
S57515

orf 61.2 {intergenic region between 41 and 61} [bacteriophage T2, Genomic, 323 nt]
gi|298524|gb|S57515|S57515 [298524]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
X05312

Bacteriophage T2 gene 38 for receptor recognizing protein
gi|15197|emb|X05312|MYT2G38 [15197]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
X04442

Bacteriophage T2 gene 37 for receptor recognizing protein
gi|15195|emb|X04442|MYT2G37 [15195]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
X12460

Bacteriophage T2 gene 32 mRNA for single-stranded DNA binding protein
gi|15192|emb|X12460|MYT2G32 [15192]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 14 nucleotide neighbors )
X57797

Bacteriophage T2 gene for gp12
gi|14875|emb|X56555|BT2GP12 [14875]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 2 nucleotide neighbors )
X01755

Bacteriophage T2 tail fiber gene 36
gi|15189|emb|X01755|MYT2F36 [15189]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor )
M14784

Bacteriophage T3 strain amNG220B right end, tail fiber protein, lysis protein and DNA packaging proteins, complete cds
gi|215810|gb|M14784|PT3RE [215810]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 9 protein links, or 10 nucleotide neighbors )

SEG_PT3RNAPOL

Bacteriophage T3 RNA polymerase III gene, 5' end
gi|710559|gb||SEG_PT3RNAPOL [710559]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors )
M22610

Bacteriophage T3 RNA polymerase III gene, 3' end
gi|340722|gb|M22610|PT3RNAPOL2 [340722]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
M22609

Bacteriophage T3 RNA polymerase III gene, 5' end
gi|340721|gb|M22609|PT3RNAPOL1 [340721]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
X05031

Bacteriophage T3 gene region 1-2.5 with primary origin of replication
gi|15719|emb|X05031|POT3ORI [15719]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 11 protein links, or 5 nucleotide neighbors )
X03964

Bacteriophage T3 early control region pos. 308-810 from genome left end
gi|15718|emb|X03964|POT3EP [15718]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, or 20 nucleotide neighbors )
X17255

Bacteriophage T3 gene 1 to gene 11
gi|15682|emb|X17255|POT3111G [15682]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,4 MEDLINE links, 36 protein links, 17 nucleotide neighbors,
or 1 genome link )

X15840
Phage T3 gene 10

gi|15625|emb|X15840|PODT3G10 [15625]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 3 nucleotide neighbors )
X02981

Bacteriophage T3 gene 1 for RNA polymerase
gi|15561|emb|X02981|PODOT3P [15561]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors )
J02503

bacteriophage t3 5' end, terminally redundant sequence (trs)
gi|215816|gb|J02503|PT3TRS1 [215816]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
SEG_PT3TRS

bacteriophage t3 5' end, terminally redundant sequence (trs)
gi|215818|gb||SEG_PT3TRS [215818]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
J02504

bacteriophage t3 3' end, terminally redundant sequence (trs)
gi|215817|gb|J02504|PT3TRS2 [215817]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

http://www.rs.noda.sut.ac.jp/~kunisawa
Bacteriophage T4 genomic database compiled by Arisaka et al.

X95646

Bacteriophage T5 DNA for region 60.5%-71% of the T5 genome
gi|2791557|emb|AJ001191|BTJ001191 [2791557]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,7 MEDLINE links, 12 protein links, or 6 nucleotide neighbors )
X56847

Bacteriophage T5 genomic region encoding early genes D10-D15
gi|15407|emb|X12930|MYT5D10 [15407]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 5 protein links, or 4 nucleotide neighbors )
AF039886

Bacteriophage T5 subclone T5.5.3r5.18r, single pass sequence, genomic survey sequence
gi|2811154|gb|AF039886|AF039886 [2811154]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF039885

Bacteriophage T5 subclone T5.40f,41f, single pass sequence, genomic survey sequence
gi|2811153|gb|AF039885|AF039885 [2811153]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF039884

Bacteriophage T5 subclone T5.26.fr, single pass sequence, genomic survey sequence
gi|2811152|gb|AF039884|AF039884 [2811152]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF039883

Bacteriophage T5 subclone 10-T5.5.7F, single pass sequence, genomic survey sequence
gi|2811151|gb|AF039883|AF039883 [2811151]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF039882

Bacteriophage T5 subclone 41-T5.5.4BF, single pass sequence, genomic survey sequence
gi|2811150|gb|AF039882|AF039882 [2811150]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF039881

Bacteriophage T5 subclone 39-T5.5.4aF, single pass sequence, genomic survey sequence
gi|2811149|gb|AF039881|AF039881 [2811149]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 nucleotide neighbor )
AF039880

Bacteriophage T5 subclone 19-T5.7.2r, single pass sequence, genomic survey sequence
gi|2811148|gb|AF039880|AF039880 [2811148]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF039879

Bacteriophage T5 subclone 18-T5.7.2F, single pass sequence, genomic survey sequence
gi|2811147|gb|AF039879|AF039879 [2811147]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF039878

Bacteriophage T5 subclone 11-T5.5.7R, single pass sequence, genomic survey sequence
gi|2811146|gb|AF039878|AF039878 [2811146]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 2 nucleotide neighbors )

AF039877

Bacteriophage T5 subclone T5.4FR, single pass sequence, genomic survey sequence
gi|2811145|gb|AF039877|AF039877 [2811145]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF039876

Bacteriophage T5 subclone 22-T5.16R, single pass sequence, genomic survey sequence
gi|2811144|gb|AF039876|AF039876 [2811144]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF039875

Bacteriophage T5 subclone 21-T5.16R, single pass sequence, genomic survey sequence
gi|2811143|gb|AF039875|AF039875 [2811143]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF039874

Bacteriophage T5 subclone 21-T5.16F, single pass sequence, genomic survey sequence
gi|2811142|gb|AF039874|AF039874 [2811142]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF039873

Bacteriophage T5 subclone 09-T5.6F, single pass sequence, genomic survey sequence
gi|2811141|gb|AF039873|AF039873 [2811141]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
AF039872

Bacteriophage T5 subclone 09-T5.6R, single pass sequence, genomic survey sequence
gi|2811140|gb|AF039872|AF039872 [2811140]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 2 nucleotide neighbors )
AF039871

Bacteriophage T5 subclone 04-T5.26.R, single pass sequence, genomic survey sequence
gi|2811139|gb|AF039871|AF039871 [2811139]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)

AF039870

Bacteriophage T5 subclone 13-T5.42F, single pass sequence, genomic survey sequence
gi|2811138|gb|AF039870|AF039870 [2811138]

(View GenBank report,FASTA report,ASN.1 report, or Graphical view)
X69460

Bacteriophage T5 ltf gene for L-shaped tail fibers
gi|15415|emb|X69460|MYT5LTF [15415]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,2 MEDLINE links, 1 protein link, or 4 nucleotide neighbors )
X03402

Bacteriophage T5 D15 gene for 5' exonuclease
gi|15413|emb|X03402|MYT5EXOG [15413]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors )
Z11972

Bacteriophage T5 tRNA-Tyr, tRNA-Glu, tRNA-Trp, tRNA-Phe, tRNA-Cys and
tRNA-Asn genes, and ORFs 91aa, 90aa, 42aa and 172aa
gi|15795|emb|Z11972|T56TRNAG [15795]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors )
X03898

Bacteriophage T5 genes for tRNA-His, -Ser and -Leu
gi|15786|emb|X03898|STT5RN1 [15786]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 2 MEDLINE links )
X04177

Bacteriophage T5 gene for transfer RNA-Gln
gi|15421|emb|X04177|MYT5TRNQ [15421]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 nucleotide neighbors )
X03899

Bacteriophage T5 genes for tRNA-Val, -Lys, -fMet, -Pro and -Ile3
gi|15787|emb|X03899|STT5RN2 [15787]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
X03798

Bacteriophage T5 gene for tRNA-Asp (GUC)
gi|15472|emb|X03798|NCT5TRDG [15472]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors )
Y00364

Bacteriophage T5 tRNA gene cluster (27.8%-22.4%)
gi|15420|emb|Y00364|MYT5TRN [15420]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 13 nucleotide neighbors )
X03140

Bacteriophage T5 DNA with rho-dependent transcription terminator (Hind DJ-P fragment)
gi|15417|emb|X03140|MYT5RHO [15417]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors )

Z35070
Bacteriophage T6 DNA

gi|535228|emb|Z35074|MYER£GBT6 [535228]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 protein link )
AF060870

Coliphage T6 small subunit distal tail fiber (gene 36) gene, partial cds; and large subunit distal tail fiber (gene 37) and tail fiber
adhesin (gene 38) genes, complete cds
gi|3676458|gb|AF052605|AF052605 [3676458]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,3 protein links, or 2 nucleotide neighbors )
Z35072

Bacteriophage T6 DNA encoding ORF19.1 gene and g19 gene
gi|535232|emb|Z35072|MYTAILT6 [535232]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 2 protein links )
X12488

Bacteriophage T6 gene 32 mRNA for single-stranded DNA binding protein
gi|15843|emb|X12488|MYT6G32 [15843]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 14 nucleotide neighbors )
Z78095

Bacteriophage T6 DNA (1506 bp)
gi|1488562|emb|Z78095|BPHZ78095 [1488562]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 protein link, or 4 nucleotide neighbors )
Z35079

Bacteriophage T6 DNA for Ip5, Ip6
gi|535215|emb|Z35079|MY57BT6 [535215]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor )
X68725

E.coli bacteriophage T6 gene for beta-glucosyl-HMC-alpha-glucosyl-transferase
gi|296439|emb|X68725|ECT6 [296439]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor )
X69894

Bacteriophage T6 alt gene for ADP-Ribosyltransferase
gi|15422|emb|X69894|MYT6ADP [15422]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor )
L46846

Bacteriophage T6 frd3, frd2 genes, complete cds
gi|951390|gb|L46846|PT6FRD32G [951390]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 2 protein links )

M27738
Bacteriophage T6 translational repressor protein (regA), complete cds
gi|215993|gb|M27738|PT6REGA [215993]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 5 nucleotide neighbors )
M38465

Bacteriophage T6 DNA ligase gene, complete cds
gi|215991|gb|M38465|PT6LIG55 [215991]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors )
V01146
Genome of bacteriophage T7
gi|431187|emb|V01146|T7CG [431187]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,13 MEDLINE links, 60 protein links, 105 nucleotide
neighbors, or 1 genome link )

X60322

Bacteriophage alpha3 genes A, B, K, C, D, E, J, F, G, H
gi|14775|emb|X60322|BACALPHA [14775]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 10 protein links, 22 nucleotide neighbors,
or 1 genome link )

X13332

Bacteriophage alpha3 DNA for origin of replication
gi|15093|emb|X13332|MIA3ORPL [15093]

(View GenBank report,FASTA report,ASN.1 report,Graphical view, or 1 MEDLINE link )
X12611

Bacteriophage alpha3 gene for protein A part., finger domain
gi|15092|emb|X12611|MIA3AFIN [15092]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, 1 protein link, or 6 nucleotide neighbors )
X15721

Bacteriophage alpha3 deletion mutation DNA for the origin region (-ori) of replication
gi|14774|emb|X15721|BA3DMOR9 [14774]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 11 nucleotide neighbors )
X15720

Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication
gi|14773|emb|X15720|BA3DMOR8 [14773]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 1 nucleotide neighbor )
X15719

Bacteriophage alpha3 insertion mutant DNA for the origin region (-ori) of replication
gi|14772|emb|X15719|BA3DMOR7 [14772]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 10 nucleotide neighbors )
X15718

Bacteriophage alpha3 deletion mutation DNA for origin region (-ori) of replication
gi|14771|emb|X15718|BA3DMOR6 [14771]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 11 nucleotide neighbors )
X15717

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14770|emb|X15717|BA3DMOR5 [14770]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 9 nucleotide neighbors )

X15716

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14769|emb|X15716|BA3DMOR4 [14769]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,1 MEDLINE link, or 10 nucleotide neighbors )
J02459

Bacteriophage lambda, complete genome
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report,FASTA report,ASN.1 report,Graphical view,87 MEDLINE links, 67 protein links, 190 nucleotide
neighbors, or 1 genome link )

J02482

Bacteriophage phi-X174, complete genome
gi|216019|gb|J02482|PX1CG [216019]

(View GenBank report.FASTA report,ASN.l report, Graphical view,2iTvIEDLINE links, 1 1 protein links, 26 nucleotide neighbors 
or 1 genome link ) k ° 

J02454 

Bacteriophage G4, complete genome 
gi|215415|gb|J02454|PG4CG [215415] 

(View GenBank report.FASTA report.ASN. 1 report.Graphical view,6 MEDLINE links, 1 1 protein links, 20 nucleotide neiohbors 
or I genome link ) 

X60323 

Bacteriophage phiK complete genome 
gi|I4781I8|emb|X60323|BPHIKCG [1478118] 

(View GenBank report,FASTA report^ASN.l report,GraphicaI view.10 protein links, 18 nucleotide neighbors, or 1 genome link ) 
L42820 

Bacteriophage BF23 tail protein Qui) gene, complete cds 
gi|1048680|gb|L42820|BBFHRS [1048680] 

(View GenBank report,FASTA report^\SN.i report,Graphical view,l MEDLINE link, 1 protein link, or 1 nucleotide neighbor ) 
X54455 

Bacteriophage BF23 gene 17 and gene 18 
gi|l4797|emb|X54455|BF2317l8G [14797] 

(View GenBank report,FASTA report^SN.! report,Graphical view,2 protein links, or 2 nucleotide neighbors ) 
M37097 

Bacteriophage BF23 DNA, right end of terminal repetition 
gi| 1 66 1 1 5|gb|M37097|BBFRIGH ( 1 66 1 1 5] 

(View GenBank report.FASTA report^SN.l report,Graphical view.l MEDLINE link, or 2 nucleotide neighbors ) 
M37096 

Bacteriophage BF23 DNA, left end of terminal repetition 
gi|166114|gb|M37096|BBFLEFT [166114] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M37095 

Bacteriophage BF23 A2-A3 gene, complete cds, and A1 gene, 5' end
gi|166110|gb|M37095|BBFA2A3 [166110]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
AF056281 

Bacteriophage BF23 clone bf23.mac5/6.1, genomic survey sequence

gi|3090930|gb|AF056281|AF056281 [3090930]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056280

Bacteriophage BF23 clone bf23.mac3, genomic survey sequence

gi|3090929|gb|AF056280|AF056280 [3090929]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056279 

Bacteriophage BF23 clone bf23.mac18/21.34, genomic survey sequence
gi|3090928|gb|AF056279|AF056279 [3090928]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056278 

Bacteriophage BF23 clone bf23.mac16/19.33, genomic survey sequence
gi|3090927|gb|AF056278|AF056278 [3090927]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056277 

Bacteriophage BF23 clone bf23.mac16/19-33, genomic survey sequence
gi|3090926|gb|AF056277|AF056277 [3090926]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056276 

Bacteriophage BF23 clone bf23.mac12/9-9, genomic survey sequence
gi|3090925|gb|AF056276|AF056276 [3090925]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056275 

Bacteriophage BF23 clone bf23.mac11/14-24, genomic survey sequence
gi|3090924|gb|AF056275|AF056275 [3090924]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056274 

Bacteriophage BF23 clone bf23.57r64r, genomic survey sequence 
gi|3090923|gb|AF056274|AF056274 [3090923] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 3 nucleotide neighbors)
AF056273 

Bacteriophage BF23 clone bf23.54fr, genomic survey sequence 

gi|3090922|gb|AF056273|AF056273 [3090922] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056272 

Bacteriophage BF23 clone bf23.47fr.mac10/7, genomic survey sequence

gi|3090921|gb|AF056272|AF056272 [3090921]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056271 

Bacteriophage BF23 clone bf23.23.66r, genomic survey sequence 

gi|3090920|gb|AF056271|AF056271 [3090920]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056270 

Bacteriophage BF23 clone bf23.23.64f, genomic survey sequence 

gi|3090919|gb|AF056270|AF056270 [3090919] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056269

Bacteriophage BF23 clone bf23.23.60r, genomic survey sequence
gi|3090918|gb|AF056269|AF056269 [3090918]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056268 

Bacteriophage BF23 clone bf23.23.60f, genomic survey sequence 
gi|3090917|gb|AF056268|AF056268 [3090917]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
AF056267 

Bacteriophage BF23 clone bf23.23.59r, genomic survey sequence 
gi|3090916|gb|AF056267|AF056267 [3090916]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056266 

Bacteriophage BF23 clone bf23.23.59f, genomic survey sequence 
gi|3090915|gb|AF056266|AF056266 [3090915] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056265 

Bacteriophage BF23 clone bf23.23.56r, genomic survey sequence 
gi|3090914|gb|AF056265|AF056265 [3090914]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056264 

Bacteriophage BF23 clone bf23.23.56f, genomic survey sequence 
gi|3090913|gb|AF056264|AF056264 [3090913] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056263 

Bacteriophage BF23 clone bf23.23.68f55r, genomic survey sequence 
gi|3090912|gb|AF056263|AF056263 [3090912]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056262

Bacteriophage BF23 clone bf23.23.43fr.66f, genomic survey sequence
gi|3090911|gb|AF056262|AF056262 [3090911]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056261 

Bacteriophage BF23 clone bf23.23.2fir, genomic survey sequence 
gi|3090910|gb|AF056261|AF056261 [3090910]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056260

Bacteriophage BF23 clone bf23.23.55.f, genomic survey sequence
gi|3090909|gb|AF056260|AF056260 [3090909]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056259 

Bacteriophage BF23 clone bf23.23.53.r, genomic survey sequence 

gi|3090908|gb|AF056259|AF056259 [3090908] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)






AF056258 

Bacteriophage BF23 clone bf23.23.53.f, genomic survey sequence
gi|3090907|gb|AF056258|AF056258 [3090907]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056257 

Bacteriophage BF23 clone bf23.23.52.r, genomic survey sequence 
gi|3090906|gb|AF056257|AF056257 [3090906]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056256 

Bacteriophage BF23 clone bf23.23.52.f, genomic survey sequence 
gi|3090905|gb|AF056256|AF056256 [3090905] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056255 

Bacteriophage BF23 clone bf23.23.49.r, genomic survey sequence 
gi|3090904|gb|AF056255|AF056255 [3090904] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056254 

Bacteriophage BF23 clone bf23.23.49.f, genomic survey sequence 
gi|3090903|gb|AF056254|AF056254 [3090903] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056253 

Bacteriophage BF23 clone bf23.23.48.r, genomic survey sequence 
gi|3090902|gb|AF056253|AF056253 [3090902] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056252 

Bacteriophage BF23 clone bf23.23.48.f, genomic survey sequence 
gi|3090901|gb|AF056252|AF056252 [3090901]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056251 

Bacteriophage BF23 clone bf23.23.44.r, genomic survey sequence 

gi|3090900|gb|AF056251|AF056251 [3090900] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056250 

Bacteriophage BF23 clone bf23.23.41.f, genomic survey sequence 

gi|3090899|gb|AF056250|AF056250 [3090899] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056249 

Bacteriophage BF23 clone bf23.23.22.a.r, genomic survey sequence

gi|3090898|gb|AF056249|AF056249 [3090898]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056248 

Bacteriophage BF23 clone bf23.23.22.a.f, genomic survey sequence

gi|3090897|gb|AF056248|AF056248 [3090897]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056247

Bacteriophage BF23 clone bf23.23.68.r, genomic survey sequence

gi|3090896|gb|AF056247|AF056247 [3090896]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

Z50114 

Bacteriophage BF23 DNA for putative tail protein gene 
gi|2464952|emb|Z50114|BF23LATE [2464952]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)

D12824

Bacteriophage BF23 genes for minor tail protein gp24 and major tail protein gp25, complete cds
gi|520578|dbj|D12824|BBF2TAIL [520578]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
Z34953 

Bacteriophage K3 ip9, ip7 and ip8 genes 
gi|535261|emb|Z34953|MYK3IP978 [535261]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
Z35075 

Bacteriophage K3 DNA for Ip3 and Ip4 
gi|535229|emb|Z35075|MYEORF64K [535229] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X05560

Bacteriophage K3 gene 38 for receptor recognizing protein
gi|15112|emb|X05560|MYK3G38 [15112]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X04747 

Bacteriophage K3 gene 37 for receptor recognizing protein 
gi|15110|emb|X04747|MYK3G37 [15110] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
X01754 

Bacteriophage K3 tail fiber gene 36 

gi|15108|emb|X01754|MYK3F36 [15108]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M16812 

Bacteriophage K3 Y lysis gene, complete cds 
gi|215503|gb|M16812|PK3LYST [215503]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
L46833 

Bacteriophage K3 frd3, frd2 genes, complete cds 
gi|951377|gb|L46833|PK3FRD32G [951377]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
L43613 

Bacteriophage K3 fibritin (wac) gene, complete cds 
gi|903861|gb|L43613|PK3WAC [903861] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
X01753

Bacteriophage Ox2 tail fiber gene 36
gi|15122|emb|X01753|MYOX2F36 [15122]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
L43612 

Bacteriophage Ox2 fibritin (wac) gene, complete cds
gi|903848|gb|L43612|OX2WAC [903848]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)

Z46880 
Bacteriophage Ox2 stp gene
gi|599663|emb|Z46880|BPOX2STP [599663]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
X05675 

Bacteriophage Ox2 gene 38 for receptor-recognizing protein and flanking regions 
gi|15124|emb|X05675|MYOX2G38 [15124]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
M33533 

Bacteriophage RB18 translational repressor protein (regA) and Orf43.1, complete cds
gi|216083|gb|M33533|RB18REGA [216083]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
AF033329 

Bacteriophage RB18 single-stranded binding protein (gene 32) gene, partial cds, and 5' region
gi|2645788|gb|AF033329|AF033329 [2645788]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 11 nucleotide neighbors)
M86231 

Bacteriophage RB69 gene 62, 3' end; RegA (regA) gene, complete cds
gi|215354|gb|M86231|P6962REGA [215354]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
AF033332 

Bacteriophage RB69 single-stranded binding protein (gene 32) gene, partial cds, and 5' region
gi|2645794|gb|AF033332|AF033332 [2645794]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 12 nucleotide neighbors)

U34036

Bacteriophage RB69 DNA polymerase (43) gene, complete cds
gi|1237125|gb|U34036|BRU34036 [1237125]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
V01145

Bacteriophage H1 genome fragment. Each thymine given in this sequence represents an HMU residue
(HMU = 5-hydroxymethyluracil)
gi|15557|emb|V01145|PODOH1 [15557]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X05676 

Bacteriophage M1 gene 38 for receptor recognizing protein and flanking regions
gi|15114|emb|X05676|MYM1G38 [15114]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
AF034575

Bacteriophage [name illegible] integrase (int) gene, complete cds, and attP region, complete sequence
gi|2662472|gb|AF034575|AF034575 [2662472]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF033321 

Bacteriophage M1 single-stranded binding protein (gene 32) gene, partial cds, and 5' region

gi|2645772|gb|AF033321|AF033321 [2645772]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 17 nucleotide neighbors)

X55190 

[title partly illegible] ... 37 and 38 (respectively), partial cds

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
AF033334 

Bacteriophage TuIb single-stranded binding protein (gene 32) gene, partial cds, and 5' region
gi|2645798|gb|AF033334|AF033334 [2645798]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 5 nucleotide neighbors)
X55191 

Bacteriophage TuIb 37 gene for receptor-recognizing protein 37 (partial cds), 38 gene for receptor-recognizing protein 38
and t gene (partial cds)
gi|14863|emb|X55191|BFTUIB [14863]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
X13065 

Bacteriophage phi80 early region 
gi|14800|emb|X13065|BP80ER [14800]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 8 protein links, or 6 nucleotide neighbors)

D00360
Bacteriophage phi80 cor gene
gi|217782|dbj|D00360|P8080COR [217782]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
X01639 

Bacteriophage phi 80 DNA-fragment with replication origin 
gi|15828|emb|X01639|XXPHI80 [15828]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 25 nucleotide neighbors)
X04051 

Lambdoid bacteriophage phi 80 int-xis region (integrase-excisionase region) 
gi|15770|emb|X04051|STPHI80X [15770] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
X06751 

Phage Phi80 DNA for major coat protein 
gi|15768|emb|X06751|STPHI80C [15768]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 11 nucleotide neighbors)
X75949 

Bacteriophage phi80 DNA for ORF x171.8 and ORF x171.28
gi|458811|emb|X75949|ECORF171B [458811]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 28 nucleotide neighbors)
L40418

Bacteriophage phi-80 gene, complete cds
gi|1019107|gb|L40418|P80A [1019107]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
M24831 

Bacteriophage phi-80 Tyr-tRNA gene, 3' end
gi|215363|gb|M24831|P80TGY [215363]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 43 nucleotide neighbors)
M10670 

Bacteriophage phi-80 replication origin 
gi|215361|gb|M10670|P80ORI [215361]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M24825 

Bacteriophage phi-80 RNA fragment
gi|215360|gb|M24825|P80M3A [215360]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M11919

Bacteriophage phi-80 cI immunity region encoding the N gene
gi|215358|gb|M11919|P80CI [215358]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
M10891 

Bacteriophage phi-80 attP site DNA 
gi|215357|gb|M10891|P80ATTP [215357] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M19473

Bacteriophage 933J (from E.coli) proviral Shiga-like toxin type 1 subunits A and B genes, complete cds
gi|215072|gb|M19473|J93SLTI [215072]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 2 protein links, or 20 nucleotide neighbors)
Y10775 

Bacteriophage 933W ileX, stx2A and stx2B genes 
gi|1938206|emb|Y10775|BP933ILEX [1938206] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 36 nucleotide neighbors)
X83722 

Bacteriophage 933W slt-IIB gene
gi|1490229|emb|X83722|B933WSLT [1490229]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 20 nucleotide neighbors)
X07865 

Bacteriophage 933W slt-II gene for Shiga-like toxin type II subunit A and B
gi|14892|emb|X07865|BWSLTII [14892]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 29 nucleotide neighbors)
M16625

Bacteriophage H19B (from E.coli) sltIA and sltIB genes encoding Shiga-like toxin I subunits A and B, complete cds
gi|215043|gb|M16625|H19BSLT [215043]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 24 nucleotide neighbors)
M17358

Bacteriophage H19B shiga-like toxin-1 (SLT-1) A and B subunit DNA, complete cds
gi|215046|gb|M17358|H19BSLTA [215046]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 20 nucleotide neighbors)
U29728 

Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds
gi|939708|gb|U29728|BNU29728 [939708]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 1 protein link)
J02580 

Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5'end; ORF2, outer membrane porin protein (lc) and ORF1 genes, 
complete cds 

gi|215366|gb|J02580|PA2LC [215366]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 4 nucleotide neighbors)
U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 46 protein links, or 5 nucleotide neighbors)
X51522

Bacteriophage P4 complete DNA genome
gi|450916|emb|X51522|MYP4CG [450916]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 13 protein links, 6 nucleotide neighbors, or 1 genome link)

X92588 

Bacteriophage 82 orf33, orf151, orf56, orf96, rus, orf45, and Q genes
gi|1051111|emb|X92588|BAC82HOLL [1051111]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 7 protein links, or 1 nucleotide neighbor)
J02803 

Bacteriophage 82 antitermination protein (Q) gene, complete cds 
gi|215364|gb|J02803|P82Q [215364]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
U02466 

Bacteriophage HK022 (cro), (cII) and (O) genes, complete cds, (?) gene, partial cds
gi|407285|gb|U02466|BHU02466 [407285]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 5 protein links, or 1 nucleotide neighbor)
M26291 

Bacteriophage D108 regulatory DNA-binding protein (ner) gene, complete cds 
gi|166194|gb|M26291|D18NER [166194]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M11272

Bacteriophage D108 left-end DNA
gi|166193|gb|M11272|D18LEDNA [166193]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M18902

Bacteriophage D108 kil gene encoding a replication protein, 3' end; and containing three ORFs, complete cds
gi|166191|gb|M18902|D18KIL [166191]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M10191

Bacteriophage D108, left end with Mu A protein binding sites L1 and L2
gi|166190|gb|M10191|D18BSL [166190]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
J02447 

bacteriophage d108 gene a 5' end
gi|166189|gb|J02447|D18AAA [166189]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
V00865 

Bacteriophage D108 fragment from genes A and ner (C-terminus of ner and N-terminus of A)
gi|15437|emb|V00865|NCD108 [15437]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X01914 

Bacteriophage IKe gene for DNA binding protein 
gi|14957|emb|X01914|INTKEDBP [14957]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)

AF064539 
Bacteriophage N15, complete genome
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 60 protein links, 26 nucleotide neighbors, or 1 genome link)

U02303

Bacteriophage If1, complete genome
gi|3676280|gb|U02303|B2U02303 [3676280]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, or 1 genome link)
AF007792 

Bacteriophage Mu late morphogenetic region 
gi|3551775|gb|AF007792|AF007792 [3551775] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)

U24159
Bacteriophage HP1 strain HP1c1, complete genome
gi|1046235|gb|U24159|BHU24159 [1046235]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 41 protein links, 8 nucleotide neighbors, or 1 genome link)

Z71579 

Bacteriophage S2 type A 5.6 kb DNA fragment 
gi|1679806|emb|Z71579|BPHSlADNA [1679806] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 9 protein links, or 9 nucleotide neighbors)
X53238 

Klebsiella sp. bacteriophage K11 gene 1 for RNA polymerase
gi|14984|emb|X53238|KSK11RPO [14984]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X85010

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
U29728 

Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds 
gi|939708|gb|U29728|BNU29728 [939708]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 1 protein link)
J02445 

bacteriophage bo1 3'-terminal region rna
gi|166152|gb|J02445|BO1TR3 [166152]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
L06183

Bacteriophage L5 (from Leuconostoc oenos) genome
gi|289353|gb|L06183|BL5GENM [289353]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 genome link)
AF074945 

Mycoplasma arthritidis bacteriophage MAV1, complete genome
gi|3511243|gb|AF074945|AF074945 [3511243]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 15 protein links, 3 nucleotide neighbors, or 1 genome link)
L13696

Bacteriophage L2 (from Mycoplasma), complete genome
gi|289338|gb|L13696|BL2CG [289338]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 14 protein links, or 1 genome link)
X80191 

Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase proteins 
gi|517237|emb|X80191|BPP7PR [517237]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 1 genome link)
M19377 

Bacteriophage Pf3 from Pseudomonas aeruginosa (New York strain), complete genome 
gi|215380|gb|M19377|PF3COMNY [215380]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 5 nucleotide neighbors)
M11912 

Bacteriophage Pf3 from Pseudomonas aeruginosa (Nijmegen strain), complete genome 
gi|215371|gb|M11912|PF3COMN [215371]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, 5 nucleotide neighbors, or 1 genome link)

V00605 

Bacteriophage Pf1 gene encoding DNA binding protein
gi|14970|emb|V00605|INOPF1 [14970]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1 nucleotide neighbor)
L05626 

Bacteriophage PR4 capsid protein (P6) gene, complete cds 
gi|215735|gb|L05626|PR4P6MAJA [215735]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
D13409

Bacteriophage phiCTX (isolated from Pseudomonas aeruginosa) cosR, attP, int genes
gi|217776|dbj|D13409|BPHCOSR [217776]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
D13408 

Bacteriophage phiCTX (isolated from Pseudomonas aeruginosa) cosL, ctx genes
gi|217775|dbj|D13408|BPHCOSLCTX [217775]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 3 nucleotide neighbors)
M24832 

Bacteriophage f2 coat protein gene, partial cds 
gi|166228|gb|M24832|F2CRNACA [166228] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
S72011 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618967|gb|AF017629|AF017629 [2618967]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017628 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds 
gi|2618964|gb|AF017628|AF017628 [2618964]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017627

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618961|gb|AF017627|AF017627 [2618961]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017626

Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds
gi|2618958|gb|AF017626|AF017626 [2618958]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors)
AF017625

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618955|gb|AF017625|AF017625 [2618955]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)

AF017624
Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618952|gb|AF017624|AF017624 [2618952]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017623

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618949|gb|AF017623|AF017623 [2618949]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017622

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618946|gb|AF017622|AF017622 [2618946]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017621

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618943|gb|AF017621|AF017621 [2618943]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
D26449 

Bacteriophage PS17 FI gene for tail sheath protein (gpFI) and FII gene for tail tube protein (gpFII), complete cds
gi|452162|dbj|D26449|BPSFIFII [452162]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
X87627 

Bacteriophage D3112 A and B genes
gi|974768|emb|X87627|BPD3112AB [974768]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
U32623 

Bacteriophage D3 transcriptional activator CII (cII) gene, complete cds
gi|984852|gb|U32623|BDU32623 [984852]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1 nucleotide neighbor)
L34781 

Bacteriophage phi 11 holin homologue (ORF3) gene, complete cds and peptidoglycan hydrolase (lytA) gene, partial cds
gi|511838|gb|L34781|BPHHOLIN [511838]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors)
L14810 

Bacteriophage P22 (gp10) gene, complete cds, and (gp26) gene, complete cds
gi|294053|gb|L14810|P22GP1026X [294053]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
X87420 

Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators
gi|1143407|emb|X87420|BPES18GEN [1143407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 9 nucleotide neighbors)
L42820 

Bacteriophage BF23 tail protein (hrs) gene, complete cds 
gi|1048680|gb|L42820|BBFHRS [1048680]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X14980 

Bacteriophage PRD1 XV gene for protein P15 (lytic enzyme) 
gi|15802|emb|X14980|TEPRD1XV [15802]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
X06321 

Bacteriophage PRD1 gene 8 for DNA terminal protein 
gi|15800|emb|X06321|TEPRD18 [15800]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 10 nucleotide neighbors)
X14336 

Filamentous Bacteriophage I2-2 genome
gi|14920|emb|X14336|INBI22 [14920]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, 1 nucleotide neighbor, or 1 genome link)
L05001

Bacteriophage X glucosyl transferase gene, complete cds
gi|216044|gb|L05001|PXFCLUSYLT [216044]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
M29479 

Bacteriophage P4 sid and psu genes, partial cds, and delta gene, complete cds
gi|215701|gb|M29479|PP4SDP [215701]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 protein links, or 4 nucleotide neighbors)

SEG_PP4PSUSID 
Bacteriophage P4 capsid size determination protein (sid) gene 5' end 
gi|215698|gb||SEG_PP4PSUSID [215698]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
M29650 

Bacteriophage P4 polarity suppression protein (psu) gene, complete cds 
gi|215697|gb|M29650|PP4PSUSID2 [215697]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M29651 

Bacteriophage P4 capsid size determination protein (sid) gene 5' end 
gi|215696|gb|M29651|PP4PSUSID1 [215696]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M27748 

Bacteriophage P4 gop, beta, and cII genes, complete cds and int gene 3' end
gi|215691|gb|M27748|PP4GOPBC [215691]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 1 nucleotide neighbor)
K02750 

Bacteriophage IKe, complete genome 
gi|215061|gb|K02750|IKECG [215061]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 4 nucleotide neighbors, or 1 genome link)

L40418 

Bacteriophage phi-80 gene, complete cds 
gi|1019107|gb|L40418|P80A [1019107]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF032122 

Bacteriophage SfII integrase (int) gene, partial cds; and bactoprenol glucosyl transferase (bgt), and glucosyl transferase II (gtrII)
genes, complete cds
gi|2465412|gb|AF021347|AF021347 [2465412]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors)
M35825 

Bacteriophage SF6 fragment D lysozyme gene, complete cds 
gi|216105|gb|M35825|SF6LYZ [216105]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)

Z35479 
Bacteriophage C16 ip1 gene
gi|534936|emb|Z35479|BC16IP1 [534936]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
X12638

Bacteriophage 21 DNA for gene 2 
gi|296141|emb|X12638|B21GENE2 [296141]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X02501 

Bacteriophage 21 DNA for left end sequence with genes 1 and 2
gi|15825|emb|X02501|XXPHA21 [15825]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
M65239 

Bacteriophage 21 lysis genes S, R, and Rz, complete cds 
gi|215466|gb|M65239|PH2LYSGEN [215466]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
M58702 

Bacteriophage 21 late gene regulatory region 
gi|215465|gb|M58702|PH2LATEGE [215465]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M81255 

Bacteriophage 21 head gene operon 
gi|215454|gb|M81255|PH2HEADTL [215454]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 10 protein links, or 4 nucleotide neighbors)
M23775 

Bacteriophage 21 glycoprotein 1 gene, complete cds, and glycoprotein gene, 5' end
gi|215451|gb|M23775|PH2GPA [215451]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
M61865 

Bacteriophage 21 excisionase (xis), integrase (int) and isocitrate dehydrogenase (icd), complete cds 
gi|215448|gb|M61865|PH22XISAA [215448]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 9 nucleotide neighbors)
AF017629

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618967|gb|AF017629|AF017629 [2618967]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017628 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds 
gi|2618964|gb|AF017628|AF017628 [2618964]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017627 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618961|gb|AF017627|AF017627 [2618961]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017626 

Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds
gi|2618958|gb|AF017626|AF017626 [2618958]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors)
AF017625

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618955|gb|AF017625|AF017625 [2618955]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017624 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds 

gi|2618952|gb|AF017624|AF017624 [2618952]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017623 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618949|gb|AF017623|AF017623 [2618949]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017622 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618946|gb|AF017622|AF017622 [2618946]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017621 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618943|gb|AF017621|AF017621 [2618943]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
M57455 

Bacteriophage 42D (clone pDB17) (from Staphylococcus aureus) staphylokinase gene, complete cds 
gi|215344|gb|M57455|P42STK [215344]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
Y12633 

Bacteriophage 85 DNA, promoter sequence of unknown gene 

gi|2058285|emb|Y12633|B85PROM [2058285] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

X98146 

Bacteriophage P1 DNA sequence around the Op88 operator
gi|1359513|emb|X98146|BP1OP88OP [1359513]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
Y07739 

Staphylococcus phage Twort holTW, plyTW genes
gi|2764979|emb|Y07739|BPTWGHOLG [2764979]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
L07580 

Bacteriophage phi-11 rinA and rinB genes, required for the activation of Staphylococcal phage phi-11 int expression
gi|166160|gb|L07580|BPHRINAB [166160]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M34832 

Bacteriophage phi-11 integrase (int) and excisionase (xis) genes, complete cds
gi|166157|gb|M34832|BPHINTX3S [166157]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)






M20394 

Bacteriophage phi-11 S.aureus attachment site (attP)
gi|166156|gb|M20394|BPHATTP [166156]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
X23128 

Bacteriophage phi-13 integrase gene
gi|758228|emb|X82312|PHI13INT [758228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)
X61719 

S.aureus phi-13 lysogen right chromosome/bacteriophage DNA junction
gi|46625|emb|X61719|SAP13RJNC [46625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X61718 

S.aureus phi-13 lysogen left chromosomal/bacteriophage DNA junction
gi|46624|emb|X61718|SAP13LJNC [46624]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)

X61717 

Bacteriophage phi-13 core sequence for attachment 
gi|14799|emb|X61717|BP13ATTP [14799]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 3 nucleotide neighbors)
U01875 

Bacteriophage phi-13 putative regulatory region and integrase (int) gene, partial cds
gi|437118|gb|U01875|U01875 [437118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, or 4 nucleotide neighbors)
X67739 

S.aureus Bacteriophage phi-42 attP gene 
gi|14809|emb|X67739|BPATTPA [14809]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
U01872 

Bacteriophage phi-42 integrase (int) gene, complete cds 
gi|437115|gb|U01872|U01872 [437115]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 3 nucleotide neighbors)
X94423 

Staphylococcus aureus bacteriophage phi-42 DNA with ORFs (restriction modification system) 
gi|1771597|emb|X94423|SARMS [1771597]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 1 nucleotide neighbor)
M27965 

Bacteriophage L54a (from S.aureus) int and xis genes, complete cds 
gi|215096|gb|M27965|L54INTXIS [215096]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
U72397 

Bacteriophage 80 alpha holin and amidase genes, complete cds 
gi|1763241|gb|U72397|B8U72397 [1763241]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
AB009866

Bacteriophage phi PVL proviral DNA, complete sequence 
gi|3341907|dbj|AB009866|AB009866 [3341907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein links, or 1 nucleotide neighbor)
Z47794

Bacteriophage Cp-1 DNA, complete genome 
gi|2288892|emb|Z47794|BPCP1XX [2288892]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 28 protein links, 1 nucleotide neighbor, or
1 genome link)

SEG_CP7RSIT

Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat
gi|166186|gb||SEG_CP7RSIT [166186]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M11635 

Bacteriophage Cp-7 (S.pneumoniae) DNA, 3' inverted terminal repeat 
gi|166185|gb|M11635|CP7RSIT2 [166185]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11636 

Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat
gi|166184|gb|M11636|CP7RSIT1 [166184]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
SEG_CP5RSIT 

Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat
gi|166181|gb||SEG_CP5RSIT [166181]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M11633 

Bacteriophage Cp-5 (S.pneumoniae) 3' inverted terminal repeat 
gi|166180|gb|M11633|CP5RSIT2 [166180]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11634

Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat
gi|166179|gb|M11634|CP5RSIT1 [166179]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

M34780

Bacteriophage Cp-9 muramidase (cpl9) gene
gi|166187|gb|M34780|CP9CPL [166187]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M34652 

Bacteriophage HB-3 amidase (hbl) gene, complete cds 
gi|215055|gb|M34652|HB3HBLA [215055]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
U64984 

Streptococcus pyogenes phage T12 repressor, excisionase (xis), integrase(int) and erythrogenic toxin A precursor (speA) genes, 
complete cds
gi|1877426|gb|U40453|SPU40453 [1877426]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 22 nucleotide neighbors)
X12375

Phage CP-T1 (Vibrio cholerae) DNA for packaging signal (pac site)
gi|15435|emb|X12375|NCCPPAC [15435]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF087814 

Vibrio cholerae filamentous bacteriophage fs-2 DNA, complete genome sequence
gi|3702207|dbj|AB002632|AB002632 [3702207]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 1 genome link)
D83518 

Bacteriophage KVP40 gene for major capsid protein precursor, complete cds
gi|3046858|dbj|D83518|D83518 [3046858]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF033322 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 17 nucleotide neighbors)
X94331 

Bacteriophage L cro, 24, c2, and c1 genes

gi|1469213|emb|X94331|BLCR024C [1469213]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 protein links)
U82619 

r 9 &~ btegme Md — e (xis) gencs - compicte 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 8 protein links, or 1 nucleotide neighbor)
Table 12



NCBI Entrez Nucleotide QUERY 
Key words: bacteriophage and lysis 
56 citations found (all selected) 



AJ011581

Bacteriophage PS119 lysis genes 13, 19, 15, and packaging gene 3,
complete cds

gi|3676084|emb|AJ011581|BPS011581 [3676084]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 1 nucleotide neighbor)

AJ011580 

Bacteriophage PS34 lysis genes 13, 19, 15, antiterminator gene 23, and
packaging gene 3, complete cds
gi|3676078|emb|AJ011580|BPS011580 [3676078]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein
links, or 2 nucleotide neighbors)



AJ011579 

Bacteriophage PS3 lysis genes 13, 19, 15, and packaging gene 3 
gi|3676073|emb|AJ011579|BPS011579 [3676073]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 1 nucleotide neighbor)



AF034975 

Bacteriophage H-19B essential recombination function protein (erf), kil
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975| [2668751]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors)



U37314

Bacteriophage lambda Rz1 protein precursor (Rz1) gene, complete cds
gi|1017780|gb|U37314|BLU37314 [1017780]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 1 protein link, or 9 nucleotide neighbors)



U00005 

E. coli hflA locus encoding the hflX, hflK and hflC genes, hfq gene,
complete cds; miaA gene, partial cds
gi|436153|gb|U00005|ECOHFLA [436153]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE
links, 5 protein links, or 8 nucleotide neighbors)



U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 46 protein links, or 5 nucleotide neighbors)



AF064539

Bacteriophage N15, complete genome
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 60 protein links, 26 nucleotide neighbors, or 1 genome link)



AF063097 

Bacteriophage P2, complete genome

gi|3139086|gb|AF063097|AF063097 [3139086]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 21 MEDLINE
links, 42 protein links, 3 nucleotide neighbors, or 1 genome link)



Z97974 

Bacteriophage phiadh lys, hol, intG, rad, and tec genes

gi|2707950|emb|Z97974|BPHIADH [2707950]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 9 protein links, or 1 nucleotide neighbor)



AF059243 

Bacteriophage NL95, complete genome
gi|3088545|gb|AF059243|AF059243 [3088545]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 4 protein links, 3 nucleotide neighbors, or 1 genome link)



AF052431 

Bacteriophage M11 A-protein, coat protein, A1-protein, and replicase
genes, complete cds

gi|2981208|gb|AF052431| [2981208]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 4 protein links, or 8 nucleotide neighbors)



Y07739 

Staphylococcus phage Twort holTW, plyTW genes
gi|2764979|emb|Y07739|BPTWGHOLG [2764979]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2
protein links)



X94331 



Bacteriophage L cro, 24, c2, and c1 genes

gi|1469213|emb|X94331|BLCR024C [1469213]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 4 protein links)



X78410 

Bacteriophage phiadh holin and lysin genes 
gi|793848|emb|X78410|LGHOLLYS [793848]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



X99260 

Bacteriophage B103 genomic sequence 
gi|1429229|emb|X99260|BB103G [1429229]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 17 protein links, or 12 nucleotide neighbors)



AJ000741 

Bacteriophage P1 darA operon
gi|2462938|emb|AJ000741|BPAJ7641 [2462938]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 10 protein links, or 31 nucleotide neighbors)



X87420 

Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators
gi|1143407|emb|X87420|BPES18GEN [1143407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein
links, or 9 nucleotide neighbors)



L35561

Bacteriophage phi-105 ORFs 1-3
gi|532218|gb|L35561|PH5ORFHTR [532218]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 3 protein links)



D10027 

Group II RNA coliphage GA genome 
gi|217784|dbj|D10027|PGAXX [217784]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, 5 nucleotide neighbors, or 1 genome link)



V01128

Bacteriophage phi-X174 (cs70 mutation) complete genome
gi|15535|emb|V01128|PHIX174 [15535]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE
links, 11 protein links, or 26 nucleotide neighbors)



S81763 



coat gene...replicase gene [bacteriophage KU1, host=Escherichia coli,
group II RNA phage, Genomic RNA, 3 genes, 120 nt]
gi|1438766|gb|S81763|S81763 [1438766]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1
MEDLINE link)



U38906 

Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and
lysin genes, complete cds
gi|1353517|gb|U38906|BRU38906 [1353517]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 50 protein links, or 3 nucleotide neighbors)



X91149 

Bacteriophage phi-C31 DNA cos region

gi|1107473|emb|X91149|APHIC31C [1107473]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 6 protein links, or 1 nucleotide neighbor)



V00642 

phage MS2 genome 

gi|15081|emb|V00642|LEMS2X [15081]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE
links, 4 protein links, or 20 nucleotide neighbors)



V01146

Genome of bacteriophage T7

gi|431187|emb|V01146|T7CG [431187]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 13 MEDLINE
links, 60 protein links, 105 nucleotide neighbors, or 1 genome link)



X78401 

Bacteriophage P22 right operon, orf 48, replication genes 18 and 12, nin
region genes, ninG phosphatase, late control gene 23, orf 60, complete
cds, late control region, start of lysis gene 13
gi|512343|emb|X78401|POP22NIN [512343]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 13 protein links, or 4 nucleotide neighbors)



Y00408 

Bacteriophage T4 gene t for lysis protein 
gi|15368|emb|Y00408|MYT4T [15368]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 3 nucleotide neighbors)



Z26590 



Bacteriophage mv4 lysA and lysB genes 
gi|410500|emb|Z26590|MV4LYSAB [410500]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 4
protein links)



X07809 

Phage phiX174 lysis (E) gene upstream region
gi|15094|emb|X07809|MIPHIXE [15094]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 4 nucleotide neighbors)



Z34528 

Lactococcal bacteriophage c2 lysin gene 
gi|506455|emb|Z34528|LBC2LYSIN [506455]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 4 nucleotide neighbors)



X15031 

Bacteriophage fr RNA genome 
gi|15071|emb|X15031|LEBFRX [15071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, 9 nucleotide neighbors, or 1 genome link)



X80191

Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase
proteins

gi|517237|emb|X80191|BPP7PR [517237]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, or 1 genome link)



X85010 

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)



X85009 

Bacteriophage A500 hol500 and ply500 genes
gi|853744|emb|X85009|BPA500PLY [853744]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 4 nucleotide neighbors)



X85008

Bacteriophage A118 hol118 and ply118 genes
gi|853740|emb|X85008|BPA118PLY [853740]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)



Z35638 



Bacteriophage phi-X174 genes for lysis protein and beta-lactamase 
gi|520996|emb|Z35638|BPLYSPR [520996]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 516 nucleotide neighbors)



J02459 

Bacteriophage lambda, complete genome
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 87 MEDLINE
links, 67 protein links, 190 nucleotide neighbors, or 1 genome link)



X87674 

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 2 nucleotide neighbors)



X87673 

Bacteriophage P1 gene 17
gi|974761|emb|X87673|BACP117 [974761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 1 nucleotide neighbor)



M14784

Bacteriophage T3 strain amNG220B right end, tail fiber protein, lysis
protein and DNA packaging proteins, complete cds
gi|215810|gb|M14784|FT3RE [215810]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 9 protein links, or 10 nucleotide neighbors)



M11813 

Bacteriophage PZA (from B.subtilis), complete genome 
gi|216046|gb|M11813|PZACG [216046]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 27 protein links, 17 nucleotide neighbors, or 1 genome link)



M16812 

Bacteriophage K3 V lysis gene, complete cds 
gi|215503|gb|M16812|PK3LYST [215503]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 4 nucleotide neighbors)



J04356 

Bacteriophage P22 proteins 15 (complete cds), and 19 (3' end) genes 
gi|215265|gb|J04356|P2215P [215265]



J04343 



Bacteriophage JP34 coat and lysis protein genes, complete cds, and
replicase protein gene, 5' end

gi|215076|gb|J04343|JP3COLY [215076]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 2 nucleotide neighbors)



J02482 



Bacteriophage phi-X174, complete genome
gi|216019|gb|J02482|PX1CG [216019]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 23 MEDLINE
links, 11 protein links, 26 nucleotide neighbors, or 1 genome link)



M99441 



ly^e^nd WA) gene, complete cds and 

gi|215820|gb|M99441|PT4ASIA [215820]

(View GenBank report,FASTA report.ASN.1 report,Graphical view J MEDLINE 
links, 2 protein links, or 2 nucleotide neighbors)



M65239 



Bacteriophage 21 lysis genes S, R, and Rz, complete cds
gi|215466|gb|M65239|PH2LYSGEN [215466]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)



M10637 



Bacteriophage G4 overlapping gene system encoding D (morphogenetic) and E
(lysis) proteins
gi|215427|gb|M10637|PG4DE [215427]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 12 nucleotide neighbors)



J02454 



Bacteriophage G4, complete genome 
gi|215415|gb|J02454|PG4CG [215415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 11 protein links, 20 nucleotide neighbors, or 1 genome link)



J02580 



Bacteriophage PA-2 (Ecoli porcine strain isolate) Rz gene. Send- ORF2 
outer membrane porin protein (lc) and ORF1 genes, complete cds
gi|215366|gb|J02580|PA2LC [215366]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, or 4 nucleotide neighbors)



M14782

Bacillus phage phi-29 head morphogenesis, major head protein, head fiber
protein, tail protein, upper collar protein, lower collar protein,
pre-neck appendage protein, morphogenesis(13), lysis, morphogenesis(15),
encapsidation genes, complete cds

gi|215323|gb|M14782|P29LATE2 [215323]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 11 protein links, or 11 nucleotide neighbors)



M10997

Bacteriophage P22 lysis genes 13 and 19, complete cds
gi|215262|gb|M10997|P221319 [215262]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 3 nucleotide neighbors)



J02467 

Bacteriophage MS2, complete genome 
gi|215232|gb|J02467|MS2CG [215232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE
links, 4 protein links, 20 nucleotide neighbors, or 1 genome link)



M14035

Bacteriophage lambda lysis S gene with mutations leading to nonlethality
of S in the plasmid pRG1
gi|215180|gb|M14035|LAMLYS [215180]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 14 nucleotide neighbors)



U04309 

Bacteriophage phi-LC3 putative holin (lysA) gene and putative murein
hydrolase (lysB) gene, complete cds
gi|530796|gb|U04309|BPU04309 [530796]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



Table 13 



NCBI Entrez Nucleotide QUERY 

Key word: holin 

51 citations found (all selected) 

AF034975 

Bacteriophage H-19B essential recombination function protein (erf), kil 
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975| [2668751]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors)

U52961 

Staphylococcus aureus holin-like protein LrgA (lrgA) and LrgB (lrgB)
genes, complete cds

gi|1841516|gb|U52961|SAU52961 [1841516]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



U28154 

Haemophilus somnus cryptic prophage genes, capsid scaffolding protein 
gene, partial cds, major capsid protein precursor, endonuclease, capsid 
completion protein, tail synthesis proteins, holin, and lysozyme genes, 
complete cds 

gi|1765928|gb|U28154|HSU28154 [1765928]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 13 protein links)



AF032122 

Streptococcus thermophilus bacteriophage Sfil9 central region of genome 
gi|2935682|gb|AF032122| [2935682]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 14 protein links, or 2 nucleotide neighbors)

AF032121 

Streptococcus thermophilus bacteriophage Sfi21 central region of genome 
gi|2935667|gb|AF032121|AF032121 [2935667]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 14 protein links, or 2 nucleotide neighbors)



AF021803 



Bacillus subtilis 168 prophage SPbeta N-acetylmuramoyl-L-alanine amidase
(blyA), holin-like protein (bhlA), holin-like protein (bhlB), and yolK
genes, complete cds; and yolJ gene, partial cds
gi|2997594|gb|AF021803|AF021803 [2997594]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 5 protein links, or 1 nucleotide neighbor)



AF057033 

Streptococcus thermophilus bacteriophage Sfi11 gp502 (orf502), gp284
(orf284), gp129 (orf129), gp193 (orf193), gp119 (orf119), gp348
(orf348), gp53 (orf53), gp113 (orf113), gp104 (orf104), gp114 (orf114),
gp128 (orf128), gp168 (orf168), gp117 (orf117), gp105 (orf105), putative
minor tail protein (orf1510), putative minor structural protein
(orf512), putative minor structural protein (orf1000), gp373 (orf373),
gp57 (orf57), putative anti-receptor (orf695), putative minor structural
protein (orf669), gp149 (orf149), putative holin (orf141), putative
holin (orf87), and lysin (orf288) genes, complete cds
gi|3320432|gb|AF057033|AF057033 [3320432]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 25 protein
links, or 1 nucleotide neighbor)



U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 46 protein links, or 5 nucleotide neighbors)



AB009866 

Bacteriophage phi PVL proviral DNA, complete sequence 
gi|3341907|dbj|AB009866|AB009866 [3341907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein
links, or 1 nucleotide neighbor)



AF009630 

Bacteriophage ML170, complete genome 
gi|3282260|gb|AF009630|AF009630 [3282260]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein
links, 3 nucleotide neighbors, or 1 genome link)



AF064539 

Bacteriophage N15, complete genome 



gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 60 protein links, 26 nucleotide neighbors, or 1 genome link)



AF063097 

Bacteriophage P2, complete genome 

gi|3139086|gb|AF063097|AF063097 [3139086]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 21 MEDLINE
links, 42 protein links, 3 nucleotide neighbors, or 1 genome link)



Z97974 

Bacteriophage phiadh lys, hol, intG, rad, and tec genes
gi|2707950|emb|Z97974|BPHIADH [2707950]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 9 protein links, or 1 nucleotide neighbor)



X95646 

Streptococcus thermophilus bacteriophage Sfi21 DNA; lysogeny module, 
8141 bp 

gi|2292747|emb|X95646|BSFI21LYS [2292747]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 19 protein links, or 3 nucleotide neighbors)



SEG_LLHLYSIN0

Bacteriophage LL-H structural protein gene, partial cds; minor
structural protein gp61 (g57), unknown protein, unknown protein,
structural protein (g20), unknown protein, unknown protein, major capsid
protein (g34), main tail protein gp19 (g17), holin (hol), muramidase
(mur), unknown protein, unknown protein, unknown protein, unknown
protein, unknown protein, and unknown protein genes, complete cds;
unknown protein gene, partial cds; and unknown protein, unknown protein,
unknown protein, unknown protein, unknown protein, minor structural
protein gp75 (g70), minor structural protein gp89 (g88), minor
structural protein gp58 (g71), unknown protein, unknown protein, unknown
protein, and unknown protein genes, complete cds
gi|1004337|gb||SEG_LLHLYSIN0 [1004337]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE
links, 31 protein links, or 1 nucleotide neighbor)



M96254 

Bacteriophage LL-H holin (hol), muramidase (mur), and unknown protein
genes, complete cds

gi|1004336|gb|M96254|LLHLYSIN03 [1004336]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)



Y07740 



Staphylococcus phage 187 ply187 and hol187 genes
gi|2764982|emb|Y07740|BP187PLYH [2764982]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2
protein links)



U88974 

Streptococcus thermophilus bacteriophage 01205 DNA sequence 
gi|2444080|gb|U88974| [2444080]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 57 protein links, or 6 nucleotide neighbors)

Z99117 

Bacillus subtilis complete genome (section 14 of 21): from 2599451 to 
2812870 

gi|2634966|emb|Z99117|BSUB0014 [2634966]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 233
protein links, 51 nucleotide neighbors, or 1 genome link)

Z99115 

Bacillus subtilis complete genome (section 12 of 21): from 2195541 to 
2409220 

gi|2634478|emb|Z99115|BSUB0012 [2634478]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 244
protein links, 64 nucleotide neighbors, or 1 genome link)

Z99110 

Bacillus subtilis complete genome (section 7 of 21): from 1194391 to 
1411140 

gi|2633472|emb|Z99110|BSUB0007 [2633472]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 226
protein links, 31 nucleotide neighbors, or 1 genome link)



X78410 

Bacteriophage phiadh holin and lysin genes 
gi|793848|emb|X78410|LGHOLLYS [793848]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



Z93946 



Bacteriophage Dp-1 dph and pal genes and 5 open reading frames
gi|1934760|emb|Z93946|BPDP1ORFS [1934760]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 6
protein links)



AF011378

Bacteriophage sk1 complete genome
gi|2392824|gb|AF011378|AF011378 [2392824]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 54 protein
links, 2 nucleotide neighbors, or 1 genome link)



Z47794 

Bacteriophage Cp-1 DNA, complete genome 
gi|2288892|emb|Z47794|BPCP1XX [2288892]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 28 protein links, 1 nucleotide neighbor, or 1 genome link)



L35561 

Bacteriophage phi-105 ORFs 1-3

gi|532218|gb|L35561|PH50RFHTR [532218]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 3 protein links)



D49712 

Bacillus licheniformis DNA for ORFs, xpaL2 homologous protein and xpaL1
homologous protein, complete and partial cds
gi|1514423|dbj|D49712|D49712 [1514423]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, or 4 protein links)



X90511

Lactobacillus bacteriophage phig1e DNA for Rorf162, Holin, Lysin, and
Rorf175 genes

gi|1926386|emb|X90511|LBPHIHOL [1926386]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 1 nucleotide neighbor)



X98106 

Lactobacillus bacteriophage phig1e complete genomic DNA
gi|1926320|emb|X98106|LBPHIG1E [1926320]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 50 protein links, or 4 nucleotide neighbors)



U72397 

Bacteriophage 80 alpha holin and amidase genes, complete cds 
gi|1763241|gb|U72397|B8U72397 [1763241]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein
links, or 2 nucleotide neighbors)



U38906 

Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and
lysin genes, complete cds
gi|1353517|gb|U38906|BRU38906 [1353517]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 50 protein links, or 3 nucleotide neighbors)



X91149 

Bacteriophage phi-C31 DNA cos region

gi|1107473|emb|X91149|APHIC31C [1107473]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 6 protein links, or 1 nucleotide neighbor)



U24159 

Bacteriophage HP1 strain HP1c1, complete genome
gi|1046235|gb|U24159|BHU24159 [1046235]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 41 protein links, 8 nucleotide neighbors, or 1 genome link)



Z26590 

Bacteriophage mv4 lysA and lysB genes
gi|410500|emb|Z26590|MV4LYSAB [410500]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 4
protein links)



Z70177 

B.subtilis DNA (28 kb PBSX/skin element region) 
gi|1225934|emb|Z70177|BSPBSXSE [1225934]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 32 protein
links, or 4 nucleotide neighbors)



Z36941 



B.subtilis defective prophage PBSX xhlA, xhlB, and xylA genes
gi|535793|emb|Z36941|BSPBSXXHL [535793]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 5 nucleotide neighbors)



X89234 



L.innocua DNA for phagelysin and holin gene 
gi|1134844|emb|X89234|LICPLYHOL [1134844]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 4 nucleotide neighbors)

X85010 

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)

X85009 

Bacteriophage A500 hol500 and ply500 genes 
gi|853744|emb|X85009|BPA500PLY [853744]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 4 nucleotide neighbors)



X85008 

Bacteriophage A118 hol118 and ply118 genes
gi|853740|emb|X85008|BPA118PLY [853740]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)

L34781 

Bacteriophage phi 11 holin homologue (ORF3) gene, complete cds and
peptidoglycan hydrolase (lytA) gene, partial cds
gi|511838|gb|L34781|BPHHOLIN [511838]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, or 2 nucleotide neighbors)



U11698 

Serratia marcescens SM6 extracellular secretory protein (nucE), putative 
phage lysozyme (nucD), and transcriptional activator (nucC) genes, 
complete cds 

gi|509550|gb|U11698|SMU11698 [509550]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)



U31763 



Serratia marcescens phage-holin analog protein (regA), putative phage
lysozyme (regB), and transcriptional activator (regC) genes, complete
cds

gi|965068|gb|U31763|SMU31763 [965068]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)



X87674 



Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 2 nucleotide neighbors)



L48605 



Bacteriophage c2 complete genome 

gi|1146276|gb|L48605|C2PVCG [1146276]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 39 protein links, 3 nucleotide neighbors, or 1 genome link)



L33769 

Bacteriophage bIL67 DNA polymerase subunit (ORF3-5), essential
recombination protein (ORF13), lysin (ORF24), minor tail protein
(ORF31), terminase subunit (ORF32), holin (ORF37), unknown protein (ORF
1-2,6-12,14-23,25-30,33-36), complete genome
gi|522252|gb|L33769|L67CG [522252]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 37 protein links, 2 nucleotide neighbors, or 1 genome link)

L31348 

Bacteriophage Tuc2009 integrase (int) gene, complete cds; lysin (lys) 
gene, 3' end 

gi|508612|gb|L31348|TU2INT [508612]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 3 protein links, or 3 nucleotide neighbors)

L31364 

Bacteriophage Tuc2009 holin (S) gene, complete cds; lysin (lys) gene, 
complete cds 

gi|496281|gb|L31364|TU2SLYS [496281]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



L31366 

Bacteriophage Tuc2009 structural protein (mp2) gene, complete cds 
gi|496278|gb|L31366|TU2MP2A [496278]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



L31365 

Bacteriophage Tuc2009 structural protein (mp1) gene, complete cds
gi|496276|gb|L31365|TU2MP1A [496276]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 1 protein link)



U04309 

Bacteriophage phi-LC3 putative holin (lysA) gene and putative murein 
hydrolase (lysB) gene, complete cds 
gi|530796|gb|U04309|BPU04309 [530796]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



Table 14 



NCBI Entrez Nucleotide QUERY 
Key word: bacteriophage and kil 
5 citations found (all selected) 



AF034975 



Bacteriophage H-19B essential recombination function protein (erf), kil
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975| [2668751]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors)

X15637 

Bacteriophage P22 operon encompassing ral, 17, kil and arf genes
gi|15646|emb|X15637|POP22PL [15646]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 7 protein links, or 2 nucleotide neighbors)



J02459 

Bacteriophage lambda, complete genome 
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 87 MEDLINE
links, 67 protein links, 190 nucleotide neighbors, or 1 genome link)

M64097 

Bacteriophage Mu left end 
gi|215543|gb|M64097|PMULEFTEN [215543]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 39 protein links, or 15 nucleotide neighbors)



M18902

Bacteriophage D108 kil gene encoding a replication protein, 3' end; and 
containing three ORFs, complete cds 
gi|166191|gb|M18902|D18KIL [166191]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 3 nucleotide neighbors)
Table 16

Phage 44AHJD complete genome sequence. 16668 nucleotides. 

1 tccatttctt tactaaactt aaaaatgctg tgcaacaact taaccaactt atctaaccta ttacatattc 
71 atcaaataca aaatttatgt atctattgac ttttattcaa aattatgatt tcaacatata ataaaattaa 
141 tttacttatt taaatattct atgatataat tagttataaa atatttggag gtgtataaat gacagaattt 
211 gatgaaatcg taaaaccaga cgacaaagaa gaaacttcag aatcaactga agaaaattta gaatcaactg
281 aagaaacttc agaatcaact gaagaatcaa ctgaagaatc aactgaagaa tcaactgaag ataaaacagt 
351 agaaacaatc gaagaagaaa atgaaaacaa attagaacct actacaacag atgaagatag ttcgaaattt 
421 gaccctgttg tattagaaca acgtattgct tcattagaac aacaagtgac tactttttta tcttcacaaa 
491 tgcaacaacc acaacaagta caacaaacac aatcagatgt aacagaatca aacaaagaag ataacgacta 
561 ttcagatgaa gaactagttg ataagttaga tttagattag gaggaattta aacatgtatg agggaaacaa 
631 catgcgttct atgatgggta catcatatga agattcaaga ttaaataaac gaacagaatt aaatgaaaac
701 atgtcaattg atacaaataa aagtgaagat agttatggtg tacaaattca ttcactttca aaacaatcat 
771 ttacaggtga cgttgaggag gaataataaa ttatggcaca acaatctaca aaaaatgaaa ctgcactttt 
841 agtagcaaag tcagctaaat cagcgttaca agattttaat catgattatt caaaatcttg gacatttggc 
911 gacaaatggg ataattcaaa tacaatgttc gaaacatttg taaataaata tttattccct aagattaatg 
981 agactttatt aatcgatatt gcattaggta atcgttttaa ttggttagct aaagagcaag attttattgg 
1051 acaatatagt gaagaatacg tgattatgga cacagtacca attaacatgg acttatctaa aaatgaggaa 
1121 ttaatgttga aacgtaatta tccacgtatg gcaactaagt tatatggtaa cggaattgtg aagaaacaaa 
1191 aattcacatt aaacaacaat gatacacgtt tcaatttcca aacattagca gacgcaacta attacgcttt 
1261 aggtgtatac aaaaagaaaa tttctgatat taatgtatta gaagaaaaag aaatgcgtgc aatgttagtt 
1331 gattactcat tgaatcaatt atccgaaaca aatgtacgta aagcaacatc aaaagaagat ttagcaagca 
1401 aagtttttga agcaatccta aacttacaaa acaacagtgc taaatataat gaagtacatc gtgcatcagg 
1471 tggtgcaatt ggacaatata caactgtatc aaaattaaaa gatattgtga ttttaacaac agattcatta 
1541 aaatcttatc ttttagatac taagattgca aacacattcc agattgcagg cattgatttc acagatcacg 
1611 ttattagttt tgacgactta ggtggcgtgt ttaaagtaac aaaagaattt aagttacaaa accaagattc
1681 aattgacttt ttacgtgcgt atggagatta tcaatcacaa ttaggagata caattccagt tggtgctgta 
1751 tttacttatg atgtatctaa acttaaagag tttactggca acgttgaaga aattaaacca aaatcagatt 
1821 tatatgcgtt tattttggat attaattcaa ttaaatataa acgttacaca aaaggtatgt taaaaccacc 
1891 attccataac cctgaatttg atgaagttac acactggatt cattactatt catttaaagc cattagtcca 
1961 ttctttaata aaattttaat tactgaccaa gatgtaaatc caaaaccaga ggaagaatta caagaataaa 
2031 aggagcgtaa aatatgaaca acgataaaag aggtttaaac gttgagttat caaaggaaat cagcaaaaga 
2101 gttgttgaac atcgcaacag atttaaacgt cttatgttta atcgttattt ggaattctta ccgctactaa 
2171 tcaactatac caatcgtgat acggttggta tagattttat tcagttagaa tcagctttaa gacaaaacat 
2241 taatgtagtt gttggtgaag ctagaaataa gcaaattatg attcttggtt atgtaaataa cacttacttt 
2311 aatcaagcac caaatttttc atcaaacttt aatttccaat ttcaaaaacg attaactaaa gaagatatat 
2381 attttattgt acctgactat ttaatacctg atgattgtct acaaattcat aagctatatg ataactgtat 
2451 gagtggtaac tttgttgtca tgcaaaataa accaattcaa tataatagtg atatagaaat tatagaacat 
2521 tatactgatg aattagcaga agttgcttta tctcgctttt ctttaatcat gcaagcaaaa tttagcaaga 
2591 tatttaaatc agaaattaat gacgagtcaa tcaatcaact tgtgtccgaa atatataacg gtgcaccatt 
2661 tgttaaaatg tcacctatgt ttaatgcaga tgacgatatc attgatttaa caagtaatag cgtaatccca 
2731 gcattaactg aaatgaaacg ggaatatcaa aacaaaatta gtgaattaag taactattta ggcattaatt 
2801 cattagccgt tgataaagaa agcggtgttt cagacgaaga ggcaaaaagt aatcgtggat ttaccacatc 
2871 aaacagtaat atctatttaa aaggtcgtga accaattacg tttttatcaa agcgttatgg tttagatatt 

2941 aaaccgtatt acgatgatga aacaacgtct aaaatatcaa tggtagacac actttttaaa gatgaaagca
3011 gtgatataaa tggctagata cacaatgact ctatacgatt tcattaaatc agaattgatt aaaaaaggtt
3081 tcaatgaatt tgtaaatgat aataaattaa cgttttatga tgatgaattt caattcatgc aaaaaatgct 
3151 gaagttcgac aaagacgttt tagctatcgt taatgaaaaa gtatttaaag gtttttcatt gaaagatgaa 
3221 ttatcagatt tactttttaa aaaatcattt acgattcatt ctttagatag agaaatcaac agacaaacag 
3291 ttgaagcatt tggcatgcaa gtgattactg tatgtattac acatgaggat tatttaaatg tggtttattc 
3361 atcaagtgaa gttgaaaaat acttacaatc acaaggcttc acagaacaca atgaagatac aacaagtaac 
3431 actgatgaaa catcgaatca aaatgctaca tctttagaca attcaactgg catgactgca aacagaaacg 
3501 cttatgtgtc attaccacaa agtgaggtta acattgatgt tgataataca acgttacgat tcgctgataa 
3571 taatacgatt gataacggta aaactgtgaa taaatcgagt aacgaaagta atcaaaacgc aaaacgtaat 
3641 caaaatcaaa aaggtaatgc aaaaggtaca caattcacta agcagtattt aattgataat attgataaag 
3711 cgtacgattt aagaaagaaa attttaaatg aatttgataa aaaatgtttt ttacaaattt ggtagaggtg 
3781 gttaaataat ggcatataat gaaaacgatt ttaaatattt tgatgacatt cgtccatttt tagacgaaat 
3851 ttataaaacg agagaacgtt atacaccgtt ttacgatgat agagcagatt ataatactaa ttcaaaatca 
3921 tattatgatt atatttcaag attatcaaaa ctaattgaag tattagcacg tcgtatttgg gactatgaca 
3991 atgaattaaa aaaacgtttc aaaaattggg acgacttaat gaaagcattt ccagagcaag cgaaagactt 
4061 atttagaggt tggttaaacg acggtacgat tgacagtatt attcatgacg agtttaaaaa atatagcgca 
4131 ggattaacat cggcatttgc tttatttaaa gttactgaaa tgaaacaaat gaatgacttt aaatcagaag 
4201 ttaaagactt aattaaagat attgaccgtt tcgttaatgg gtttgaatta aatgagcttg aaccaaagtt 
4271 cgtgatgggc tttggtggta ttcgcaacgc agttaaccaa tctattaata ttgataaaga aacaaatcac 
4341 atgtactcta cacaatccga ttctcaaaaa cctgaaggtt cttggataaa taaattaaca cctagtggtg
4411 acttaatttc aagcatgcgt attgtacagg gtggtcatgg tacaacaatc ggattagaac gtcaatccaa 
4481 tggtgaaatg aaaatctggt tacatcacga tggtgttgca aaactgttac aagtcgcata taaagataat 
4551 tatgtattag atttagaaga ggctaaaggt ttaacagatt atacaccaca gtcactttta aacaaacaca 
4621 cacttacacc gttaattgat gaagcaaatg acaaactcat tttaagattc ggtgacggaa caatacaggt 
4691 tcgttcaaga gcagacgtaa aaaatcacat tgataatgta gaaaaagaaa tgacaattga taattcagaa 
4761 aacaatgata atcgttggat gcaaggcatt gctgttgatg gtgatgattt atactggtta agtggtaaca

4831 gttcagttaa ttcacatgtt caaatcggta aatattcatt aacaacaggt caaaagattt atgattatcc 

4901 atttaagtta tcatatcaag acggtattaa tttcccacgt gataacttta aagagcctga gggtatttgc 

4971 atttatacaa atccaaaaac aaaacgtaaa tcgttattac ttgctatgac aaacggcggt ggtggaaaac

5041 gtttccataa tttatatggt ttcttccaac ttggtgagta tgaacacttt gaagcattac gcgcaagagg 

5111 ttcacaaaac tataaattaa caaaagacga cggtcgtgca ttatctattc cagaccatat cgacgattta 

5181 aatgacttaa cgcaagctgg tttttattat attgacgggg gtactgcaga aaaacttaag aatatgccaa 

5251 tgaatggtag caagcgtata attgacgctg gttgtttcat taatgtatac cctacaacac aaacattagg

5321 tacggttcaa gaattaacac gtttctcaac aggtcgtaaa atggttaaaa tggtgcgtgg tatgacttta 

5391 gacgtattta cgttaaaatg ggattatgga ttatggacaa caatcaaaac tgacgcacca tatcaagaat

5461 atttggaagc aagtcaatac aataactgga ttgcttatgt aacaacagct ggtgagtatt acattacagg
5531 taaccaaatg gaattattta gagacgcgcc agaagaaatt aaaaaagtgg gtgcatggtt acgtgtgtca
5601 agtggtaacg cagtcggtga agtaagacaa acattagagg ctaatatatc ggaatataaa gaattcttca 
5671 gtaatgttaa tgcggaaaca aaacatcgtg aatatggttg ggtagcaaaa catcaaaaat aggagtgata 
5741 taaatgaaat cacaacaaca agcaaaagaa tggatatata agcatgaggg ggcaggtgtt gactttgatg 
5811 gtgcatatgg atttcaatgt atggacttat cagttgctta tgtgtattac attactgacg gtaaagttcg 
5881 catgtggggt aatgctaaag acgcgataaa taatgacttt aaaggtttag cgacggtgta taaaaataca 
5951 ccgagcttta aacctcaatt aggggacgtt gctgtatata caaatggaca atatggacat attcaatgtg 
6021 tgttaagtgg aaatcttgat tattatacat gcttagaaca aaactggtta ggcggcggtt ttgacggttg 
6091 ggaaaaagca accattagaa cacattatta tgacggtgta actcacttta ttagacctaa attttcaggt 
6161 agtaatagca aagcattaga aacatcaaaa gtaaatacat ttggaaaatg gaaacgaaac caatacggca 
6231 catattatag aaatgaaaat ggtacattta catgtggttt tttaccaata tttgcacgtg tcggtagtcc 
6301 aaaattatca gaacctaatg gctattggtt ccaaccaaac ggttatacac catataacga agtttgttta 
6371 tcagatggtt acgtatggat tggttataac tggcaaggca cacgttatta tttaccagtg cgccaatgga 
6441 atggaaaaac aggtaatagt tacagtgttg gtattccttg gggggtgttc tcataatggg tattttagcc 
6511 tttttctttg aatttagttg gaaaagatac aaataagagg tgtaaacaat ggctgataga atcgtaagaa 
6581 gtttaagaca agttgaaaca attgaacgtt tattggagga aaaaaatgag aaagttaacg aattttaagt 
6651 ttttctataa cacaccgttt acagactatc aaaacacgat tcattttaat agtaataaag aacgtgatga 
6721 ttatttttta aatggtcgtc attttaaatc gttagactat tcaaaacaac cgtataattt tatacgtgat 
6791 agaatggaaa tcaatgttga tatgcagtgg catgacgcac aaggtattaa ctacatgacg tttttatcag 
6861 attttgagga tagaagatat tacgcttttg taaaccaaat cgaatacgtg aatgacgttg tggttaaaat 
6931 atattttgtc attgatacca ttatgacgta tacacaaggg aatgtattag agcaactctc aaacgtcaat 
7001 attgaacgtc aacatttatc aaaacgcacg tataactata tgttaccaat gttacgtaat aatgatgatg 
7071 tgttaaaagt atcaaataaa aactatgttt ataaccaaat gcaacaatat ctggaaaatt tagtattatt 
7141 ccagtcaagc gctgatttat caaagaaatt tggtactaaa aaagagccaa acttagatac gtcaaaaggt 
7211 acgatttatg acaatatcac atcaccagtc aacttatacg ttatggaata tggtgacttt attaacttta 
7281 tggataaaat gagtgcctat ccatggatta cgcaaaactt tcaaaaggtt caaatgttac ctaaagactt
7351 tattaataca aaagacttag aggacgttaa aaccagtgaa aaaattacag gattaaaaac attaaaacag 
7421 ggtggtaaat caaaagaatg gagtctaaaa gatttatcat taagtttctc aaatcttcaa gagatgatgt 
7491 tatctaaaaa agatgaattt aaacatatga tacgtaatga gtatatgaca attgaatttt atgactggaa 
7561 tggaaatacg atgttactcg acgctggtaa gatttcacaa aaaactggtg ttaagttacg tacaaaatca 
7631 attattggtt atcataatga agttcgagta tatccagtag attataacag tgctgaaaac gacagaccaa 
7701 tactcgctaa aaataaagaa atattgattg atacgggttc attcttaaat acaaatataa catttaatag 
7771 ttttgcacaa gtaccaatat taatcaataa tggtatctta ggacaatcac aacaagccaa ccgacaaaaa 
7841 aatgcagaaa gtcaattaat tacaaatcgt attgataatg tattaaatgg tagcgacccg aaatcacgct 
7911 tttatgacgc tgtgagtgta gcaagtaatt taagtccaac tgctttattt ggtaagttta atgaagaata 
7981 taatttctac aaacaacaac aagctgaata taaagattta gccttacaac caccttctgt aactgaatca 
8051 gaaatgggca acgcattcca aattgcgaat agcattaacg gtttaacgat gaaaattagt gtaccgtcac 
8121 ctaaagaaat tacattttta caaaaatatt atatgttgtt tggttttgaa gtgaatgact ataattcatt 
8191 tattgaacca attaacagta tgactgtttg caattattta aaatgtacag gtacgtatac tatacgtgac 
8261 atcgacccca tgttaatgga acaattaaaa gcaattttag aatctggtgt aagattttgg cataatgacg 
8331 gttcaggtaa tccaatgtta caaaatccat taaataacaa atttagagag ggggtataat atgaacgaag 
8401 taaaattcag atttacagac tcagaagcgt ttcacatgtt tatatacgct ggggatttaa aattactcta 
8471 ctttttattt gtattaatgt tcgttgatat tattacaggt atttcaaaag caattaaaaa taataactta 
8541 tggtcaaaaa aatcaatgag aggattttct aaaaaattat tgatattctg tattatcatt ttagcaaaca 
8611 tcattgacca gattttacaa ttaaaaggtg gtctactcat gattacaata ttttattata ttgcaaatga 
8681 gggactttct attgtagaaa attgtgcaga aatggacgta ttagtaccag aacaaattaa agataaatta 
8751 agagtcatta aaaatgatac tgaaaagagt gataacaatg aacgatcaag agaagataga taaatttacg 
8821 cattcctata ttaatgatga ttttggttta acgatagacc agttagtccc taaagtaaaa ggatatgggc 
8891 gctttaatgt atggcttggt ggtaatgaaa gtaaaatcag acaagtatta aaagcagtaa aagagatagg 
8961 tgtttcacct actctttttg ccgtatatga aaaaaatgag ggttttagtt ctggacttgg ttggttaaac 
9031 catacgtctg cacgtggtga ttatttaaca gatgctaaat tcatagcaag aaagttagta tcacaatcaa 
9101 aacaagctgg acaaccgtct tggtatgacg caggtaacat cgtccacttt gtaccacaag acgtacaaag 
9171 aaaaggtaat gcagattttg caaaaaatat gaaagcaggt acaattggac gtgcatatat tccattaaca 
9241 gcagctgcta cttgggcggc atattatcct ttaggtttga aagcatcata taacaaagta caaaactatg 
9311 gtaatccatt tttagacggt gcgaatacta ttctagcttg gggtggtaaa ttagacggta aaggtggatc 
9381 acctagtgat tcgtctgaca gtggtagtag tggtgacagt ggtagttcac tactcgcttt agcaaaacaa 
9451 gccatgcaag aattattaaa aaaaatacaa gacgcattac aatgggacgt tcatagtatt ggtagtgata
9521 aattttttag taatgattat tttacattag aaaaaacatt taacaacaca tatcatatta aaatgacgat 
9591 tggtttactt gattcattaa aaaaactgat tgatagcgtt caagtagata gtgggagtag tagttctaat 
9661 cctactgatg atgacggaga ccataaacca attagtggta aatcagtcaa gccaaatgga aaaagtggtc 
9731 gtgtgattgg tggtaactgg acatatgcac agttaccaga aaaatataaa aaagcaattg gtgtaccttt 
9801 attcaaaaaa gaatacttat acaaaccagg taacatattt cctcaaacgg gtaatgcagg acaatgtaca 
9871 gaattaacat gggcgtatat gtcacaacta catggtaaaa gacaacctac cgacgacggt caaataacaa 
9941 acggtcagcg tgtatggtac gtctataaaa agttaggtgc aaaaacaaca cataatccaa cagtaggtta 
10011 tggtttctct agtaaaccac catacttaca agcaactgca tatggtattg gtcacacagg tgttgttgta 






10081 gcagtttttg aagatggttc gtttttagtt gcaaactata atgtaccacc atatgttgca ccatcacgtg 

10151 tggtattgta tacactcatt aatggcgtac caaataatgc tggtgataat attgtattct ttagtggtat 

10221 tgcttaatta actatgctat aatgaacaca tgctagtaat gctagtaaat aaaatacaaa acataatcaa 

10291 ttttcgtaca catttttcat gttatctcaa aaagaaaagg agactgttat tttaacagtt gccttttttt 

10361 atttcatcat gttcacgttt taatatatgc aaatcagatt tgttatgtac tgaacgttca actggaaata 

10431 agtcgttaag tgaaaatgaa ccgatgtcac tttcaatata aagaatatca tcaaattgac tatggtcgaa 

10501 attttctcta gcgtctttta atataaattc acgtttcata ttaagttcat cagtaaaata ttcatcatat 

10571 acattaccac atacaatttc agttttagac ggatatatcg atattgtacc ttgctcatta tagatacttt 

10641 tattgttttc aataatggca ccgtcaaaga attgttcacg tacaaaggtt tcaaaatcga cgcttgtatc 

10711 aaaggcgttt ttcggtatac cagcagaagc aattttaatc tttccattca cttcatatgc atatttctta 

10781 tgattcagta caaacatctt atctatctgt tcgttttcaa tatcccattt acctaaggct atcgggtcga 

10851 ataaactggg gttcaataag ggtttaacaa cggatttcat atacaaacta tcagtatcgc aataaataaa

10921 attgtcgtca atttcacttt ccgttaagta ttggaaagga accaataagt tatacaatga acgtgatgtg 

10991 acaaatgtag agaataatat attacgttca gtgtttttgt aaccgttaat gatattgtat agttcattgt 

11061 tatcatctaa acggaataag ttaaaatgtg aacgtaatgc aggtatgcca tataatccat ttaaaacgac 

11131 tttagataac ataacctcct catttgagta tgggtgttcg ttgatatcat cagtaatgtg atagtcgtaa 

11201 ggtgatgtca tattgatttt gttttttaac ttaccttgtg ttttaataaa atagttttga aaaataatat 

11271 cacgtgcatg aaagtattca cattcatata taacaaacga attaacacgt atatgcatgc aatcaatacc 

11341 cgtaatgtct tgaatcattc ttaatgtatt tgtattgata ttaacgtaat cattatcatt attatagtat 

11411 tttacaatca tttgacgtaa tacacgtgat ttaattttaa ttaataaatc atcgttaaat acatctttat 

11481 caatcttata taatgaaaaa taattgtcat catctaaaaa agtagggatt aacgttggtt ctgaatagtg

11551 ttcgtaaaag tataaccatg ttggaatttt ttcatgatac atcacataag gataactcga attgatgtca 

11621 atagaaaaac aaggctcatc aattagcttg tttatgtatt tggtgttata catatttaaa ccaccacgat 

11691 agaatgattt aatatagtca taaaaattca tatcatggaa atgataatgt gtataagata ttttaatatc 

11761 ttgatattgg ttgagtaact gaaaacgtgt catttcatta ttcaagtaag attccataat attcaatgaa 

11831 aatgttaatc tgttatagtc aaaatttgga aatatatcac tataatgaat atggcacata cctaatataa 

11901 tcacgtcatt atgaatgtat gtaagttgtt caggtgtgag ttttgcaaaa catttcacag catagtcata 

11971 ggcttcacta tcattcatat cattatcttt atcaaaaatc gtataattaa aatctgtttt aagttgtgat 

12041 tctgttaaat aaccaccatc aagtaatttc ttacctaatg ttgcaattga tgtattggtt ttcataaagt 

12111 tatcaataat attaaattta aaaccattta aaaacattgt taaatctaaa ttgattgaag atttaacacg 

12181 tttttctaaa attacatttt gatttttggc taaaatagta gcctctttca tttttaatgt gtgttcattt 

12251 tcttctgcag attttaaata tatattttcg cgtgtaatat tatcaaaata acgcatggtg tctttaagta

12321 aaaaatgatt atcgtattta ttacagttat gtgcaatcat gataatatct gtttttgatt ttgtgattgt 

12391 atcacgtctt ttcacatacg tataaaatgc gtcataaaaa gattcgaaac tcggaaatac ttcaacatca

12461 atttcataac cattaaacca accaattgct acagaataag taacgttttt atatttggtt ggtttttttc 

12531 gtccgttaac tttattgtac gctaatgttt ctatatccca gtataaaatc attcgacgtt catgtttatg 

12601 atattgcatg cattctagta atcccataat cttacacacc ttttataagc catattgttt cattagatac 

12671 tttttcgtat tctctatata gttatcttcg tatatttttt cttttctttc aaactcactc atatttttct 

12741 tcatttcatt ttttatatga aattttataa ttttattcat atctaaatat aaatatctat cattatcaac 

12811 cacgtaattt ttagagtaag cattgtcaaa atgtaaattg cttggattgt agtaataacg ttccatgttt 

12881 tctttataaa acatatcatc acgtaaatag gtaacatgat tgtctatatc cctaatttta gtacaaaatt 

12951 catattgttt tgtatatggt acaacgataa tatttgtcat aaaagtagtt acattataca tgactttaat 

13021 atatttatca tcagttttga tatagaagaa atcaccgttt tgattgatgt gatttcttaa attatcatcc 

13091 gccaaattat attcgttaaa ttcaaattct ccagttgtca tagcgtcgtc atttgaatta aacgcacgtg 

13161 tgttacgttt ttcattcacg taatcgtttc gtcgcatttc taaaaaaatg tttttgtaaa gtcttgatgt 

13231 attcatttta tgcttttgta ataaattgta tatatttaaa ttggataata taggacttga aaagttgact 

13301 gcattaccta gtaaaaacat tttagggaat ccaatataat caacgttacc atggttacgg tcgattgatt 

13371 catatattgt ttttaactta tcccactcat caattaaata atcatcttca agtgctaaaa actcatcata 

13441 tataataata ggatagtgtt ttaaaaagtt agaatgatat tttaaatcag tggcactatt caaatctgta 

13511 atcacaccaa tttctttatc ttgatagata atagctaaat agtccctagc acttctgaac gtgacacgtt 

13581 ttgatttaaa tagtggattt tcatctatga tttcttcaat aaaatcacgg taagcgtcac gtaatgtata 

13651 atgacgtgat aataaagtaa attttatatc aagtttaata gctaaataaa taaaaaatga aacatagttg
13721 aacgattttc catcagaacg gtttgaaata gatatataat aatctatatc atcattcata agttcatcaa 

13791 ctaattctat ttgattatac ttatctggga ttttttttct gacatgattg acagcatttt gataatctct
13861 taccatgtct aaacgatttt gttttaccat gtttttgctc cttgtaatag tttatgatgt cgtttacagt 
13931 gttaaattta ttcgtcaaat gttgcataat ataaaaagtt atacctcaca tcttcatcat caatatttgt 
14001 cactggtcta tctgatttac caatttcttt atataaagta tcgatttctt taatatattt atacattgaa 
14071 gaattattat ttttagcttg taaattatat aaagcgtatt tatgcttttt agcgttttta ttattagaat 
14141 catcattacg gttatatatt tcaagaatat aatttaattt tttatgtctt gaacctctta ccaatgatac 
14211 agcatttaca tatgatacgt ttctttcttt aggaaaatag ggcagatgtg caaaatgttt ccatgtgtca 
14281 atgtacgcct cttgtaaatc tttatcatca aatttaaaat taacattact aaaatcattt aaaaataaat 
14351 ctttttcttg ctcttttcta gcttctcttt cttttttcca tctatccatt tcagacgtat gtctaaccaa 
14421 tgttatcaac ctccatataa agcataaata accattaaaa agataatata gaatataatc aatgtagtga 
14491 ataaaacacc aaatgacacg cgtatatgca gtgtcataag tatgataagt gtaattaaaa atgctaaaag
14561 gaaaacaatg gctatgttta ataggttatt catggtcaat cactttccca ttatcgtata tgactttgtt 
14631 ttgataaata atcattaatt cgctttcaag aggtttatca aaatttgata atacgtcgtc aattgtaacg 
14701 tttaataaaa tttctcttat taattcatta cttaaataat ttctataata aaatacaagt atattaaaaa 
14771 catgtttttt aatatcaatg tcgatatcta acgtaaataa ctctttttca atttcaaaat catcatattg 
14841 tttgtcaaac tcaatataca catcacccat atttattttt actatacatt ttttattaga tgaagtaaat 
14911 ttttcaaatt tatcattata ataatctcta tttgttaaaa ggtaataaat taaattattt aatctaaaag 
14981 tagttttaat tttcattttt atatctcctt aatgtattct atgatatacg cgtatttttt agtgaacagg 
15051 ttatattcat aatatgaata tacaacttta gcgtcatata aatcttcaaa cattgagatt tgatgtggaa
15121 aatgtccttt aatctcatcg caatataata ataccgtttt gtatttacgt tccatttaaa cacctcataa 
15191 aaaatagggg ataagtatcc cctatgaaat tgtattaaaa tgatacttga ccaaaattga ttaagtaacc 
15261 tttttgacct tttttgtttt catattcata aattgtgaat tgaacttctc cagcattgat aatgtcaaca 
15331 acgtcctcat ctgctctcat ttctttaatt aattctgtta agtggttcgg taagtttacg ttatagtcat 
15401 cagtgacgat aacaccttgt tcaccgaatt ttgattcttt gtttgtgaat aatgctctaa cgatatactc
15471 ttttttcata ccgtattttt ctactaattc tgatagtttg ataaattctc tttctttttc ctcaaattca
15541 aatctcgcta atgtgttttg gtgtcttgat aaaatatctt ttacgtttgt cattttattt ctcctcttat
15611 ttaaattatt tgctttctgc aattgcgatt tgtagtaaat cattgtaata aacttgaatt gttttcgttg
15681 tgcgtgtagt ggacaatagt ttacatgtgc ctggtaataa ttcttttgct tgtgttttgg ttaaatgata
15751 ctcgtgaagt ggtaaaaatt cctcaatgta ttcattatca tcatctaagt aatgaagtat ataacctttg
15821 acacgtaagg taacaatgtc gtcaactttc attattatat cactcctttc taaaaaacgt aaacgttata
15891 cgtttcataa aatcctttat gcatattcca ttgttctatt gggtcatcac cagcaatata agacaatatt
15961 gattctggtt tagtttcgtt gtttagttca tcatttaaga attgaacaac agaactatta tagtttaata
16031 atagttgttg gcaagccgat aataagttaa ttgcattgtc aaatgtataa gctggattcc attgaatcag
16101 tttattgaat agttgcaaca tttcagtata ggcttgtcct ttttcttctg gtgcattatc aacattaacc
16171 attattatca cttcctaata aagttgaaat tacgcgtaaa acagaattat gatttaaatc ttcaatttca
16241 tcaatgtcaa catcataaaa tgaaatttca ttttctgttc tatcaaataa cgctatacat aaacttccat
16311 tcttaaaacg aaaaacatgc ttcaactcaa tgttttttgt ttcattttcc atttttgtta ctccttgttt
16381 tgattacata cttagtatag caaacgttta aaagttttgt caatagtttt tcttaaaaaa gtttaaataa
16451 ttttaaaact actatttaat agaagaaata agattttaag ttcaaatcat aattttgaat aaaagtcaat
16521 agatacataa attttgtatt tgatgaatat gtaataggtt agataagttg gttaagttgt tgcacagtat
16591 ttttaagttt agtaaagaaa tgataagtaa atttataagt tttgatttgt ataatcgttt attttaaacc
16661 ggtggggt
Table 17



Phage 44AHJD ORFs list

nb   Name           Frame   Position        Size (a.a.)   Key words
1    44AHJDORF001   -1      10342..12627    761           DNA polymerase;
2    44AHJDORF002   3       3789..5732      647           Teichoic acid; Staph;
3    44AHJDORF003   2       6626..8389      587           Tail;
4    44AHJDORF004   1       8764..10227     487           Serine protease motif;
5    44AHJDORF005   -1      12643..13890    415
6    44AHJDORF006   2       803..2029       408
7    44AHJDORF007   1       2044..3027      327           Upper collar;
8    44AHJDORF008   2       3020..3775      251           Lower collar;
9    44AHJDORF009   2       5744..6496      250           Amidase; Staph;
10   44AHJDORF010   -2      13938..14420    160
11   44AHJDORF012   3       8391..8813      140           Holin;
12   44AHJDORF013   -2      14586..14996    136
13   44AHJDORF113   1       199..600        133
14   44AHJDORF011   -2      15225..15593    122
15   44AHJDORF114   -2      15870..16172    100
16   44AHJDORF014   3       6243..6521      92
17   44AHJDORF015   1       15403..15645    80
18   44AHJDORF016   -1      15616..15852    78
19   44AHJDORF017   -2      10536..10757    73
20   44AHJDORF018   -1      886..1098       70
21   44AHJDORF019   -2      9630..9836      68
22   44AHJDORF121   -1      16165..16362    65
23   44AHJDORF020   2       13865..14053    62
24   44AHJDORF123   2       614..796        60
25   44AHJDORF021   -2      5634..5816      60
26   44AHJDORF023   -2      6315..6494      59
27   44AHJDORF024   1       14275..14451    58
28   44AHJDORF025   -3      14999..15175    58
29   44AHJDORF026   -3      14426..14593    55
30   44AHJDORF027   1       12916..13080    54
31   44AHJDORF029   -1      15019..15183    54
32   44AHJDORF028   -3      9071..9235      54
33   44AHJDORF030   3       14487..14648    53
34   44AHJDORF031   2       11039..11191    50
35   44AHJDORF135   3       693..842        49
36   44AHJDORF033   -1      3646..3795      49
37   44AHJDORF032   -2      9306..9455      49
38   44AHJDORF034   -3      14000..14146    48
39   44AHJDORF035   -3      13811..13957    48
40   44AHJDORF036   -3      10019..10165    48
41   44AHJDORF022   -3      8468..8611      47
42   44AHJDORF037   1       14788..14931    47
43   44AHJDORF038   -2      3528..3671      47
44   44AHJDORF039   3       1743..1883      46
45   44AHJDORF040   2       9740..9877      45
46   44AHJDORF041   2       15836..15973    45
47   44AHJDORF042   -1      5014..5151      45
48   44AHJDORF043   -1      4402..4539      45
49   44AHJDORF044   -2      12783..12917    44
50   44AHJDORF149   -2      639..770        43
51   44AHJDORF046   1       4891..5019      42
52   44AHJDORF047   [illegible]  [illegible]  [illegible]
53   44AHJDORF045   2       10655..10783    42
54   44AHJDORF048   -3      15212..15340    42
55   44AHJDORF049   3       5784..5909      41
56   44AHJDORF050   3       13158..13283    41
57   44AHJDORF051   -2      10944..11066    40
58   44AHJDORF052   -3      14216..14338    40




59 


44AHJDORF053 


3 


3348.-3467 


39 




60 


44AHJOORF054 


3 


7551. .7670 


39 




61 


44AHJDORF055 


3 


15705..15821 


38 




62 


44AHJOORF056 


1 


5512..5625 


37 




63 


44AHJOORF057 


2 


10121..10231 


36 




64 


44AHJDORF058 


3 


10767..10877 


36 
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65 


44AHJDORF164 


-1 


592..702 


36 




66 


44AHJOORF059 


-2 


8250..8360 


36 




67 


44AHJDORF060 


-2 


6147..6257 


36 




68 


44AHJDORF061 


2 


15551. .15658 


35 




69 


44AHJDORF062 


1 


428S..4389 


34 




70 


44AHJDORF063 


-3 


9383..9487 


34 




71 
72 
73 


44AHJDORF065 
44AHJDORF064 
44AHJDORF066 


1 
2 
-2 


5029..5130 
2609..2710 
10380..10481 


33 
33 
33 








Table 18 



Predicted amino acid sequences 



44AHJDORF001 

12627 atgggattactagaatgcatgcaatatcataaacatgaacgtcgaatgattttatactgggatatagaaacattagcgtacaat 

1 MGLLECMQYHKHERRMILYWDIETLAYN 

12543 aaagttaacggacgaaaaaaaccaaccaaatataaaaacgttacttattctgtagcaattggttggtttaatggttatgaaatt 

29 KVNGRKKPTKYKNVTYSVAIGWFNGYEI

12459 gatgttgaagtatttccgagtttcgaatctttttatgacgcattttatacgtatgtgaaaagacgtgatacaatcacaaaatca 

57 DVEVFPSFESFYDAFYTYVKRRDTITKS

12375 aaaacagatattatcatgattgcacataactgtaataaatacgataatcattttttacttaaagacaccatgcgttattttgat 

85 KTDIIMIAHNCNKYDNHFLLKDTMRYFD 

12291 aatattacacgcgaaaatatatatttaaaatctgcagaagaaaatgaacacacattaaaaatgaaagaggctactattttagcc

113 NITRENIYLKSAEENEHTLKMKEATILA 

12207 aaaaatcaaaatgtaattttagaaaaacgtgttaaatcttcaatcaatttagatttaacaatgtttttaaatggttttaaattt 

141 KNQNVILEKRVKSSINLDLTMFLNGFKF

12123 aatattattgataactttatgaaaaccaatacatcaattgcaacattaggtaagaaattacttgatggtggttatttaacagaa 

169 NIIDNFMKTNTSIATLGKKLLDGGYLTE

12039 tcacaacttaaaacagattttaattatacgatttttgataaagataatgatatgaatgatagtgaagcctatgactatgctgtg 

197 SQLKTDFNYTIFDKDNDMNDSEAYDYAV

11955 aaatgttttgcaaaactcacacctgaacaacttacatacattcataatgacgtgattatattaggtatgtgccatattcattat 

225 KCFAKLTPEQLTYIHNDVIILGMCHIHY

11871 agtgatatatttccaaattttgactataacaaattaacattttcattgaatattatggaatcttacttgaataatgaaatgaca 

253 SDIFPNFDYNKLTFSLNIMESYLNNEMT

11787 cgttttcagttactcaaccaatatcaagatattaaaatatcttatacacattatcatttccatgatatgaatttttatgactat 

281 RFQLLNQYQDIKISYTHYHFHDMNFYDY

11703 attaaatcattctatcgtggtggtttaaatatgtataacaccaaatacataaacaaactaattgatgagccttgtttttctatt 

309 IKSFYRGGLNMYNTKYINKLIDEPCFSI 

11619 gacatcaattcgagttatccttatgtgatgtatcatgaaaaaattccaacatggttatacttttacgaacactattcagaacca 

337 DINSSYPYVMYHEKIPTWLYFYEHYSEP

11535 acgtcaatccctacttttttagatgatgacaattatttttcattatataagattgataaagatgtatttaacgatgatttatta 

365 TLIPTFLDDDNYFSLYKIDKDVFNDDLL

11451 attaaaattaaatcacgtgtattacgtcaaatgattgtaaaatactataataatgataatgattacgttaatatcaatacaaat 

393 IKIKSRVLRQMIVKYYNNDNDYVNINTN

11367 acattaagaatgattcaagacattacgggtattgattgcatgcatatacgtgttaattcgtttgttatatatgaatgtgaatac 

421 TLRMIQDITGIDCMHIRVNSFVIYECEY

11283 tttcatgcacgtgatattatttttcaaaactattttattaaaacacaaggtaagttaaaaaacaaaatcaatatgacatcacct 

449 FHARDIIFQNYFIKTQGKLKNKINMTSP

11199 tacgactatcacattactgatgatatcaacgaacacccatactcaaatgaggaggttatgttatctaaagtcgttttaaatgga 

477 YDYHITDDINEHPYSNEEVMLSKVVLNG

11115 ttatatggcatacctgcattacgttcacattttaacttattccgtttagatgataacaatgaactatacaatatcattaacggt 

505 LYGIPALRSHFNLFRLDDNNELYNIING

11031 tacaaaaacactgaacgtaatatattattctctacatttgtcacatcacgttcattgtataacttattggttcctttccaatac 

533 YKNTERNILFSTFVTSRSLYNLLVPFQY 

10947 ttaacggaaagtgaaattgacgacaattttatttattgcgatactgatagtttgtatatgaaatccgttgttaaacccttattg 

561 LTESEIDDNFIYCDTDSLYMKSVVKPLL

10863 aaccccagtttattcgacccgatagccttaggtaaatgggatattgaaaacgaacagatagataagatgtttgtactgaatcat 

589 NPSLFDPIALGKWDIENEQIDKMFVLNH

10779 aagaaatatgcatatgaagtgaatggaaagattaaaattgcttctgctggtataccgaaaaacgcctttgatacaagcgtcgat 

617 KKYAYEVNGKIKIASAGIPKNAFDTSVD

10695 tttgaaacctttgtacgtgaacaattctttgacggtgccattattgaaaacaataaaagtatctataatgagcaaggtacaaca 

645 FETFVREQFFDGAIIENNKSIYNEQGTI

10611 tcgatatatccgtctaaaactgaaattgtatgtggtaatgtatatgatgaatattttactgatgaacttaatatgaaacgtgaa 

673 SIYPSKTEIVCGNVYDEYFTDELNMKRE

10527 tttatattaaaagacgctagagaaaatttcgaccatagtcaatttgatgatattctttatattgaaagtgacatcggttcattt 
701 FILKDARENFDHSQFDDILYIESDIGSF

10443 tcacttaacgacttatttccagttgaacgttcagtacataacaaatctgatttgcatatattaaaacgtgaacatgatgaaata 

729 SLNDLFPVERSVHNKSDLHILKREHDEI

10359 aaaaaaggcaactgttaa 10342 

757 K K G N C * 
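Each entry in this table pairs 84-nucleotide lines with their 28-residue translations, and reverse-frame ORFs such as 44AHJDORF001 are listed on the coding strand with descending coordinates. The following minimal Python sketch shows how one such line can be checked. It assumes the standard genetic code; the helpers `translate` and `reverse_complement` are hypothetical names and are not part of the original listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order ("*" = stop).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str) -> str:
    """Translate a coding-strand sequence codon by codon (trailing bases ignored)."""
    nt = nt.lower()
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(nt: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return nt.lower().translate(str.maketrans("acgt", "tgca"))[::-1]


# Last line of 44AHJDORF001 (coordinates 10359 down to 10342).
coding = "aaaaaaggcaactgttaa"
print(translate(coding))           # residues 757 onward, ending in the stop
print(reverse_complement(coding))  # the same stretch as read on the forward strand
```

Translating each nucleotide line and comparing it with the residue line beneath it is one way to separate transcription damage in the residues from damage in the nucleotides.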
44AHJDORF002 

3789 atggcatataatgaaaacgattttaaatattttgatgacattcgtccacttttagacgaaatttataaaacgagagaacgtcac 

1 MAYNENDFKYFDDIRPFLDEIYKTRERY

3873 acaccgttttacgatgatagagcagattataatactaattcaaaatcatattatgattatatttcaagattatcaaaactaatt 

29 TPFYDDRADYNTNSKSYYDYISRLSKLI

3957 gaagtattagcacgtcgtatttgggactatgacaatgaattaaaaaaacgtttcaaaaattgggacgacttaatgaaagcattt 

57 EVLARRIWDYDNELKKRFKNWDDLMKAF

4041 ccagagcaagcgaaagacttatttagaggttggttaaacgacggtacgattgacagtattattcatgacgagtttaaaaaatat 

85 PEQAKDLFRGWLNDGTIDSIIHDEFKKY

4125 agcgcaggattaacatcggcatttgctttatttaaagttactgaaatgaaacaaatgaatgactttaaatcfagaagttaaagac 

113 SAGLTSAFALFKVTEMKQMNDFKSEVKD

4209 ttaattaaagatattgaccgtttcgttaatgggtttgaattaaatgagcttgaaccaaagtttgtgatgggctttggtggtatt 
141 LIKDIDRFVNGFELNELEPKFVMGFGGI

169 RMAVNQSINIDK [remainder of line illegible]

4377 ggtttttggataaataaattaacacctagtggtgacttaatttcaagcatgcgtattgtacagggtggtcatggtacaacaatc

197 GFWINKLTPSGDLISSMRIVQGGHGTTI

4461 ggattagaacgtcaatccaatggtgaaatgaaaatctggttacatcacgatggtgttgcaaaactgttacaagtcgcatataaa

225 GLERQSNGEMKIWLHHDGVAKLLQVAYK

4545 gataattatgtattagactcagaagaggccaaaggcctaacagattatacaccacagtcacttttaaacaaacacacatttaca

253 DNYVLDLEEAKGLTDYTPQSLLNKHTFT

4629 ccgttaattgatgaagcaaatgacaaactcattttaagattcggtgacggaacaatacaggttcgttcaagagcagacgtaaaa

281 PLIDEANDKLILRFGDGTIQVRSRADVK

4713 aatcacattgataatgtagaaaaagaaatgacaattgataactcagaaaacaatgataaccgctggatgcaaggcattgctgtt

309 NHIDNVEKEMTIDNSENNDNRWMQGIAV

4797 gatggtgatgatttatactggttaagtggtaacagttcagttaattcacatgttcaaatcggtaaatattcattaacaacaggt

337 DGDDLYWLSGNSSVNSHVQIGKYSLTTG

4881 [...]atttatgattatccatttaagttatcatatcaagacggtattaacttcccacgtgacaactttaaagagcctgagggt

365 QKIYDYPFKLSYQDGINFPRDNFKEPEG

= rfrfrr rr rrrr ? -rrrrr rrrrrrr rrr 



tt nVFTLKWDYGLWTTi^*" rtr ^ ~ 



617 BAN 
5721 catcaaaaatag 5732 
645 HQK*
44AHJDORF003
6626 



55. ^tattcUcaagcg^ 

1S9 attLtgacLatcLaLaccagtcaacttatacgttatggaatatggtgaccttattaactttatggataaaatgagtgcc 
IYD MI TS PVML YV H_ E Y O 0 f I ■ W H D K M S A 



7214 
197 



sr H444rrfr rrrrrrrrrr rrrrrrrrr; 



7382 
253 
7466 

281 F 
7550 ta 
309 Y 
7634 a 
337 
7718 



LLaJttcaagagatgatgttatctaaaaaaga^ 
-ttggtLcataatgaagttcgagtatatccagCagattataacagtgctg^ 

gaaacattgattgatacgggttcattcttaaatacaaatataacatttaatagctttgcacaagtaccaatattaatcaataat 
; 6 30 2 ggtatc^aggacaaLa^aaLagccaaccgaLaLaLcgcagaaagtcaactaattacaaatcgtattga 

a. H4fi4™f™^ 
505 LQKYYMLFGFEVNDYNSFIEPINSMTVC

8222 aattatttaaaatgtacaggtacgtatactatacgtgacatcgaccccatgttaatggaacaattaaaagcaattttagaatct

533 NYLKCTGTYTIRDIDPMLMEQLKAILES

8306 ggtgtaagattttggcataatgacggttcaggtaatccaatgttacaaaatccattaaataacaaatttagagagggggtataa 

8389 

561 GVRFWHNDGSGNPMLQNPLNNKFREGV* 
44AHJDORF004 

8764 atgatactgaaaagagtgataacaatgaacgaccaagagaagatagataaatttacgcattcctatattaatgatgattttggt 

1 MILKRVITMNDQEKIDKFTHSYINDDFG

8848 ttaacgatagaccagttagtccctaaagtaaaaggatatgggcgctttaatgtatggcttggtggtaatgaaagtaaaatcaga

29 LTIDQLVPKVKGYGRFNVWLGGNESKIR

8932 caagtattaaaagcagtaaaagagataggtgtttcacctactctttttgccgtatatgaaaaaaatgagggttttagttctgga 

57 QVLKAVKEIGVSPTLFAVYEKNEGFSSG

9016 cttggttggttaaaccatacgtctgcacgtggtgattatttaacagatgctaaattcatagcaagaaagttagtatcacaatca 

85 LGWLNHTSARGDYLTDAKFIARKLVSQS

9100 aaacaagctggacaaccgtcttggtatgacgcaggtaacatcgtccaccttgtaccacaagacgtacaaagaaaaggtaatgca 

113 KQAGQPSWYDAGNIVHFVPQDVQRKGNA

9184 gattttgcaaaaaatatgaaagcaggtacaattggacgtgcatatattccattaacagcagctgctacttgggcggcatattat 

141 DFAKNMKAGTIGRAYIPLTAAATWAAYY

9268 cctttaggtttgaaagcatcatataacaaagtacaaaactatggtaatccatttttagacggtgcgaatactattctagcttgg 

169 PLGLKASYNKVQNYGNPFLDGANTILAW

9352 ggtggtaaattagacggtaaaggtggatcacctagtgattcgtctgacagtggtagtagtggtgacagtggtagttcactactc 

197 GGKLDGKGGSPSDSSDSGSSGDSGSSLL

9436 gctttagcaaaacaagccatgcaagaattattaaaaaaaatacaagacgcattacaatgggacgttcatagtattggtagtgat 

225 ALAKQAMQELLKKIQDALQWDVHSIGSD

9520 aaattttttagtaatgattattttacattagaaaaaacatttaacaacacatatcatattaaaatgacgattggtttacttgat 

253 KFFSNDYFTLEKTFNNTYHIKMTIGLLD

9604 tcattaaaaaaactgattgatagcgttcaagtagatagtgggagtagtagttctaatcctactgatgatgacggagaccataaa 

281 SLKKLIDSVQVDSGSSSSNPTDDDGDHK

9688 ccaattagtggtaaatcagtcaagccaaatggaaaaagtggtcgtgtgattggtggtaactggacatatgcacagttaccagaa 

309 PISGKSVKPNGKSGRVIGGNWTYAQLPE

9772 aaatataaaaaagcaattggtgtacctttattcaaaaaagaatacttacacaaaccaggtaacatatttcctcaaacgggtaat 

337 KYKKAIGVPLFKKEYLYKPGNIFPQTGN 

9856 gcaggacaatgtacagaattaacatgggcgtatatgtcacaactacatggtaaaagacaacctaccgacgacggtcaaataaca 

365 AGQCTELTWAYMSQLHGKRQPTDDGQIT

9940 aacggtcagcgtgtatggtacgtctataaaaagttaggtgcaaaaacaacacataatccaacagtaggttatggtttctctagt

393 NGQRVWYVYKKLGAKTTHNPTVGYGFSS 

10024 aaaccaccatacttacaagcaactgcatatggtattggtcacacaggtgttgttgtagcagtttttgaagatggttcgttttta 

421 KPPYLQATAYGIGHTGVVVAVFEDGSFL 

10108 gttgcaaactataatgtaccaccatatgttgcaccatcacgtgtggtatcgtatacactcattaatggcgtaccaaataatgct 

449 VANYNVPPYVAPSRVVLYTLINGVPNNA

10192 ggtgataatattgtattctttagtggtattgcttaa 10227 

477 GDNIVFFSGIA*
44AHJDORF005

13890 atggtaaaacaaaatcgtttagacatggtaagagattatcaaaatgctgtcaatcatgtcagaaaaaaaatcccagataagtat 

1 MVKQNRLDMVRDYQNAVNHVRKKIPDKY

13806 aatcaaatagaattagttgatgaacttatgaatgatgatatagattattatatatctatttcaaaccgttctgatggaaaatcg 

29 NQIELVDELMNDDIDYYISISNRSDGKS

13722 ttcaactatgtttcattttttatttatttagctattaaacctgatataaaatttactttattatcacgtcattatacattacgt

57 FNYVSFFIYLAIKLDIKFTLLSRHYTLR

13638 gacgcttaccgtgattttattgaagaaatcatagatgaaaatccactatttaaatcaaaacgtgtcacgttcagaagtgctagg 

85 DAYRDFIEEIIDENPLFKSKRVTFRSAR

13554 gactatttagctattatctatcaagataaagaaattggtgtgattacagatttgaatagtgccactgatttaaaatatcattct 

113 DYLAIIYQDKEIGVITDLNSATDLKYHS

13470 aactttttaaaacactatcctattattatatatgatgagtttttagcacttgaagatgattatttaattgatgagtgggataag 

141 NFLKHYPIIIYDEFLALEDDYLIDEWDK

13386 ttaaaaacaatatatgaatcaatcgaccgtaaccatggtaacgttgattatattggattccctaaaatgtttttactaggtaat 

169 LKTIYESIDRNHGNVDYIGFPKMFLLGN

13302 gcagtcaacttttcaagtcctacattatccaatttaaatatatacaatttattacaaaagcataaaatgaatacatcaagactt 

197 AVNFSSPILSNLNIYNLLQKHKMNTSRL

13218 tacaaaaacacttttttagaaatgcgacgaaacgattacgtgaatgaaaaacgtaacacacgtgcgtttaattcaaatgacgac 

225 YKNIFLEMRRNDYVNEKRNTRAFNSNDD

13134 gctatgacaactggagaatttgaatttaacgaatataatttggcggatgataatttaagaaatcacatcaatcaaaacggtgat 

253 AMTTGEFEFNEYNLADDNLRNHINQNGD 

13050 ttcttctacatcaaaactgatgataaatatattaaagtcatgtataatgtaactacttttatgacaaatattatcgttgtacca 

281 FFYIKTDDKYIKVMYNVTTFMTNIIVVP

12966 tatacaaaacaatatgaattttgtaccaaaattagggatatagacaatcatgttacctatctacgtgatgatatgttttataaa 

309 YTKQYEFCTKIRDIDNHVTYLRDDMFYK 

12882 gaaaacatggaacgttattactacaatccaagcaatttacattttgacaatgcttactctaaaaattacgtggttgataatgat 

337 ENMERYYYNPSNLHFDNAYSKNYVVDND

12798 agatatttatatttagatatgaataaaattataaaatttcatataaaaaatgaaatgaagaaaaatatgagtgagcttgaaaga 

365 RYLYLDMNKIIKFHIKNEMKKNMSEFER

12714 aaagaaaaaatatacgaagataactatatagagaacacgaaaaagtatctaatgaaacaatatggcttataa 12643 

393 KEKIYEDNYIENTKKYLMKQYGL* 
44AHJDORF006 
803 atggcacaacaatctacaaaaaatgaaactgcacttttagtagcaaagtcagctaaatcagcgttacaagattttaatcatgat

1 MAQQSTKNETALLVAKSAKSALQDFNHD

887 tattcaaaatcttggacatttggcgacaaatgggataattcaaatacaatgttcgaaacatttgtaaataaatatttattccct 

29 YSKSWTFGDKWDNSNTMFETFVNKYLFP

971 aagattaatgagactttattaatcgatattgcattaggcaatcgttttaattggttagctaaagagcaagattttattggacaa 

57 KINETLLIDIALGNRFNWLAKEQDFIGQ

1055 tatagtgaagaatacgtgattatggacacagtaccaattaacatggacttatctaaaaatgaggaattaatgttgaaacgtaat 

85 YSEEYVIMDTVPINMDLSKNEELMLKRN

1139 tatccacgtatggcaactaagttatatggtaacggaattgtgaagaaacaaaaattcacattaaacaacaatgatacacgtttc 

113 YPRMATKLYGNGIVKKQKFTLNNNDTRF

1223 aatttccaaacattagcagacgcaactaattacgctttaggtgtatacaaaaagaaaatttctgatattaatgtattagaagaa 

141 NFQTLADATNYALGVYKKKISDINVLEE

1307 aaagaaatgcgtgcaatgttagttgattactcattgaatcaattatccgaaacaaatgtacgtaaagcaacatcaaaagaagat 

169 KEMRAMLVDYSLNQLSETNVRKATSKED

1391 ttagcaagcaaagtttttgaagcaatcctaaacttacaaaacaacagtgctaaatataatgaagcacatcgtgcatcaggtggt 

197 LASKVFEAILNLQNNSAKYNEVHRASGG

1475 gcaattggacaatatacaactgtatcaaaattaaaagatattgtgattttaacaacagattcattaaaatcttatcttttagat 

225 AIGQYTTVSKLKDIVILTTDSLKSYLLD

1559 actaagattgcaaacacattccagattgcaggcattgatttcacagatcacgttattagttttgacgacttaggtggcgtgttt 

253 TKIANTFQIAGIDFTDHVISFDDLGGVF

1643 aaagtaacaaaagaatttaagttacaaaaccaagactcaattgactttttacgtgcgtatggagattatcaatcacaattagga 

281 KVTKEFKLQNQDSIDFLRAYGDYQSQLG

1727 gatacaattccagttggtgctgtatttacttatgatgtatctaaacttaaagagtttactggcaacgttgaagaaattaaacca 

309 DTIPVGAVFTYDVSKLKEFTGNVEEIKP

1811 aaatcagatttatatgcgtttactttggatattaattcaattaaatataaacgttacacaaaaggtatgttaaaaccaccattc 

337 KSDLYAFILDINSIKYKRYTKGMLKPPF

1895 cataaccctgaatttgatgaagttacacactggattcattactattcatttaaagccattagtccattctttaataaaatttta 

365 HNPEFDEVTHWIHYYSFKAISPFFNKIL 

1979 attactgaccaagatgtaaatccaaaaccagaggaagaattacaagaataa 2029 

393 ITDQDVNPKPEEELQE*
44AHJDORF007

2044 atgaacaacgataaaagaggtttaaacgttgagttatcaaaggaaatcagcaaaagagttgttgaacatcgcaacagatttaaa 

1 MNNDKRGLNVELSKEISKRVVEHRNRFK
2128 cgtcttatgtttaatcgttatttggaatttttaccgctactaatcaactataccaatcgtgatacggttggtatagattttatt 
29 RLMFNRYLEFLPLLINYTNRDTVGIDFI
2212 cagttagaatcagctttaagacaaaacattaatgtagttgttggtgaagctagaaataagcaaattatgattcttggttatgta 
57 QLESALRQNINVVVGEARNKQIMILGYV
2296 aataacacttactttaatcaagcaccaaatttttcatcaaactttaatttccaatttcaaaaacgattaactaaagaagatata 
85 NNTYFNQAPNFSSNFNFQFQKRLTKEDI 
2380 tattttattgtacctgactatttaatacctgatgattgtctacaaattcataagctatatgataactgtatgagtggtaacttt 
113 YFIVPDYLIPDDCLQIHKLYDNCMSGNF
2464 gttgtcatgcaaaataaaccaattcaatataatagtgatatagaaattatagaacattatactgatgaattagcagaagttgct 
141 VVMQNKPIQYNSDIEIIEHYTDELAEVA
2548 ttatctcgcttttctttaatcatgcaagcaaaatttagcaagatatttaaatcagaaattaatgacgagtcaatcaatcaactt 
169 LSRFSLIMQAKFSKIFKSEINDESINQL
2632 gtgtccgaaatatataacggtgcaccatttgttaaaatgtcacctatgtttaatgcagatgacgatatcattgatttaacaagt 
197 VSEIYNGAPFVKMSPMFNADDDIIDLTS
2716 aatagcgtaatcccagcattaactgaaatgaaacgggaatatcaaaacaaaattagtgaattaagtaactatttaggcattaat 
225 NSVIPALTEMKREYQNKISELSNYLGIN

2800 tcattagccgttgataaagaaagcggtgtttcagacgaagaggcaaaaagtaatcgtggatttaccacatcaaacagtaatatc
253 SLAVDKESGVSDEEAKSNRGFTTSNSNI
2884 tatttaaaaggtcgtgaaccaattacgtttttatcaaagcgttatggtttagatattaaaccgtattacgatgatgaaacaacg 
281 YLKGREPITFLSKRYGLDIKPYYDDETT
2968 tctaaaatatcaatggtagacacactttttaaagatgaaagcagtgatataaatggctag 3027 

309 SKISMVDTLFKDESSDING*
44AHJDORF008

3020 atggctagatacacaatgactttatacgatttcattaaatcagaattgatcaaaaaaggtttcaatgaatttgtaaatgataat 

1 MARYTMTLYDFIKSELIKKGFNEFVNDN

3104 aaattaacgttttatgatgatgaatttcaattcatgcaaaaaatgctgaagttcgacaaagacgttttagctatcgttaatgaa 

29 KLTFYDDEFQFMQKMLKFDKDVLAIVNE 

3188 aaagtatttaaaggtttttcattgaaagatgaattatcagatttactttttaaaaaatcatttacgattcattttttagataga 

57 KVFKGFSLKDELSDLLFKKSFTIHFLDR

3272 gaaatcaacagacaaacagttgaagcatttggcatgcaagtgattactgtatgtattacacatgaggattatttaaatgtggtt 

85 EINRQTVEAFGMQVITVCITHEDYLNVV

3356 tatccatcaagtgaagttgaaaaatacttacaatcacaaggcttcacagaacacaatgaagatacaacaagtaacactgatgaa 

113 YSSSEVEKYLQSQGFTEHNEDTTSNTDE 

3440 acatcgaatcaaaatgctacacctttagacaattcaactggcatgaccgcaaacagaaacgcttatgtgtcattaccacaaagt 

141 TSNQNATSLDNSTGMTANRNAYVSLPQS

3524 gaggttaacattgatgttgataatacaacgctacgattcgctgataataatacgattgataacggtaaaactgtgaataaatcg 

169 EVNIDVDNTTLRFADNNTIDNGKTVNKS

3608 agtaacgaaagtaatcaaaacgcaaaacgtaatcaaaatcaaaaaggtaacgcaaaaggtacacaattcactaagcagtattta 

197 SNESNQNAKRNQNQKGNAKGTQFTKQYL 

3692 attgataatattgataaagcgtacgatttaagaaagaaaattttaaatgaatttgataaaaaatgttttttacaaatttggtag 

225 IDNIDKAYDLRKKILNEFDKKCFLQIW*
44AHJDORF009 
5744 atgaaatcacaacaacaagcaaaagaatggatatataagcatgagggggcaggtgttgactttgatggtgcatatggatttcaa

1 MKSQQQAKEWIYKHEGAGVDFDGAYGFQ 

5828 tgtatggacttatcagttgcttatgtgtattacattactgacggtaaagttcgcatgtggggtaatgccaaagacgcgataaat

29 CMDLSVAYVYYITDGKVRMWGNAKDAIN 

5912 aacgactttaaaggtttagcgacggtgtataaaaatacaccgagctttaaacctcaattaggggacgttgctgtatatacaaat

57 NDFKGLATVYKNTPSFKPQLGDVAVYTN 

5996 ggacaatatggacatattcaatgtgtgttaagtggaaatcttgattattatacatgcttagaacaaaactggttaggcggcggt 

85 GQYGHIQCVLSGNLDYYTCLEQNWLGGG 

6080 tttgacggttgggaaaaagcaaccattagaacacattattatgacggtgtaactcactttattagacctaaattttcaggtagt 

113 FDGWEKATIRTHYYDGVTHFIRPKFSGS

6164 aatagcaaagcattagaaacatcaaaagtaaatacatttggaaaatggaaacgaaaccaatacggcacatattatagaaatgaa 

141 NSKALETSKVNTFGKWKRNQYGTYYRNE

6248 aatggtacatttacatgtggttttctaccaatatttgcacgtgtcggtagtccaaaattatcagaacctaatggctattggttc

169 NGTFTCGFLPIFARVGSPKLSEPNGYWF 

6332 caaccaaacggttatacaccatataacgaagtttgtttatcagatggttacgtatggattggtcataactggcaaggcacacgt

197 QPNGYTPYNEVCLSDGYVWIGYNWQGTR

6416 tattatttaccagtgcgccaatggaatggaaaaacaggtaatagttacagtgttggtattccttggggggtgttctcataa 6496

225 YYLPVRQWNGKTGNSYSVGIPWGVFS*
44AHJDORF010

14420 ttggttagacatacgtctgaaatggatagatggaaaaaagaaagagaagctagaaaagagcaagaaaaagatttatttttaaat 

1 LVRHTSEMDRWKKEREARKEQEKDLFLN

14336 gattttagtaatgttaattttaaatttgatgataaagacttacaagaggcgtacattgacacatggaaacattttgcacatctg

29 DFSNVNFKFDDKDLQEAYIDTWKHFAHL

14252 ccctattttcctaaagaaagaaacgtatcatatgtaaatgctgtatcattggtaagaggttcaagacataaaaaattaaattat

57 PYFPKERNVSYVNAVSLVRGSRHKKLNY

14168 attcttgaaatatataaccgtaatgatgattctaataataaaaacgctaaaaagcataaatacgctttatataatttacaagct 

85 ILEIYNRNDDSNNKNAKKHKYALYNLQA 

14084 aaaaataataattcttcaatgtataaatatattaaagaaatcgatactttatataaagaaattggtaaatcagatagaccagtg 

113 KNNNSSMYKYIKEIDTLYKEIGKSDRPV

14000 acaaatattgatgatgaagatgtgaggtataactttttatattatgcaacatttgacgaataa 13938

141 TNIDDEDVRYNFLYYATFDE*
44AHJDORF011

15593 atgacaaacgtaaaagatattttatcaagacaccaaaacacattagcgagatttgaatttgaggaaaaagaaagagaatttatc 

1 MTNVKDILSRHQNTLARFEFEEKEREFI 

15509 aaactatcagaattagtagaaaaatacggtatgaaaaaagagtatatcgttagagcactattcacaaacaaagaatcaaaattc 

29 KLSELVEKYGMKKEYIVRALFTNKESKF

15425 ggtgaacaaggtgttatcgtcactgatgaccataacgtaaacttaccgaaccacttaacagaattaattaaagaaatgagagca 

57 GEQGVIVTDDYNVNLPNHLTELIKEMRA

15341 gatgaggacgttgttgacattatcaatgccggagaagttcaattcacaatttatgaatatgaaaacaaaaaaggtcaaaaaggt 

85 DEDVVDIINAGEVQFTIYEYENKKGQKG

15257 tactcaatcaattttggtcaagtatcattttaa 15225 

113 YSINFGQVSF* 
44AHJDORF012 

8391 atgaacgaagtaaaattcagacttacagactcagaagcgtttcacatgtttatatacgctggggatttaaaattactctacttt 

1 MNEVKFRFTDSEAFHMFIYAGDLKLLYF 

8475 ttatttgtattaatgttcgttgatattattacaggtatttcaaaagcaattaaaaataataacttatggtcaaaaaaatcaatg 

29 LFVLMFVDIITGISKAIKNNNLWSKKSM

8559 agaggattttctaaaaaattactgatattctgtattaccattttagcaaacaccatcgaccagattttacaattaaaaggtggt 

57 RGFSKKLLIFCIIILANIIDQILQLKGG

8643 ctactcatgattacaatatttcattatattgcaaatgagggactttctattgtagaaaattgtgcagaaatggacgtattagta 

85 LLMITIFYYIANEGLSIVENCAEMDVLV

8727 ccagaacaaattaaagataaattaagagtcattaaaaatgatactgaaaagagtgataacaatgaacgatcaagagaagataga 

113 PEQIKDKLRVIKNDTEKSDNNERSREDR

8811 taa 8813

141 * 
44AHJDORF013 

14996 atgaaaattaaaactacttttagattaaataatttaatttattaccttttaacaaatagagattattataatgataaatttgaa

1 MKIKTTFRLNNLIYYLLTNRDYYNDKFE

14912 aaatttacttcatctaataaaaaatgtatagtaaaaataaatatgggtgatgtgtatattgagtttgacaaacaatatgatgat

29 KFTSSNKKCIVKINMGDVYIEFDKQYDD

14828 tttgaaattgaaaaagagttatttacgttagatatcgacattgatattaaaaaacatgtttttaatatacttgcattttattat 

57 FEIEKELFTLDIDIDIKKHVFNILVFYY

14744 agaaattatttaagtaatgaattaataagagaaattttattaaacgttacaattgacgacgtattatcaaattttgataaaccc 

85 RNYLSNELIREILLNVTIDDVLSNFDKP

14660 cttgaaagcgaattaatgattatttatcaaaacaaagtcatatacgataatgggaaagtgattgaccatgaataa 14586

113 LESELMIIYQNKVIYDNGKVIDHE*
44AHJDORF113 

199 atgacagaatttgatgaaatcgtaaaaccagacgacaaagaagaaacttcagaatcaactgaagaaaatttagaatcaactgaa 

1 MTEFDEIVKPDDKEETSESTEENLESTE

283 gaaacttcagaatcaactgaagaatcaactgaagaatcaactgaagaatcaactgaagataaaacagtagaaacaatcgaagaa 

29 ETSESTEESTEESTEESTEDKTVETIEE

367 gaaaatgaaaacaaattagaacctactacaacagatgaagatagttcgaaatttgaccctgttgcattagaacaacgcattgct 

57 ENENKLEPTTTDEDSSKFDPVVLEQRIA

451 tcattagaacaacaagtgactacttttttatcttcacaaatgcaacaaccacaacaagtacaacaaacacaatcagatgtaaca 

85 SLEQQVTTFLSSQMQQPQQVQQTQSDVT

535 gaatcaaacaaagaagataacgactattcagatgaagaactagttgataagttagatctagattag 600 
113 ESNKEDNDYSDEELVDKLDLD*
44AHJDORF114

16172 atggttaatgttgataatgcaccagaagaaaaaggacaagcctatactgaaatgttgcaactattcaataaactgattcaatgg

1 MVNVDNAPEEKGQAYTEMLQLFNKLIQW

16088 aatccagcttatacatttgacaatgcaattaacttattatcggcttgccaacaactattactaaactataatagttctgttgtt 

29 NPAYTFDNAINLLSACQQLLLNYNSSVV 

16004 caattcttaaatgatgaactaaacaacgaaactaaaccagaatcaatattgtcttatattgctggtgatgacccaatagaacaa 

57 QFLNDELNNETKPESILSYIAGDDPIEQ

15920 tggaatatgcataaaggattttatgaaacgtataacgtttacgttttttag 15870 

85 WNMHKGFYETYNVYVF*
44AHJDORF014

6243 atgaaaatggtacacttacatgtggttttttaccaatatttgcacgtgtcggtagtccaaaattatcagaacctaatggctatt

1 MKMVHLHVVFYQYLHVSVVQNYQNLMAI

6327 ggttccaaccaaacggttatacaccatataacgaagtttgtttatcagatggtcacgcatggattggttataactggcaaggca 

29 GSNQTVIHRITKFVYQMVTYGLVITGKA

6411 cacgtcattatttaccagtgcgccaatggaatggaaaaacaggtaatagttacagtgttggtattccttggggggtgttctcat 

57 HVIIYQCANGMEKQVIVTVLVFLGGCSH

6495 aatgggtattttagcctttttctttga 6521

85 NGYFSLFL*
44AHJDORF015

1 VTITPCSPHFDSLFVNNALTIYSFF

15487 ttttctaccaattctgatagtttgataaattctctttctttttcctcaaattcaaatctcgctaatgtgttttggtgtcttgat 

29 FSTNSDSLINSLSFSSNSNLANVFWCLD

15571 aaaatatcttttacgtttgtcattttatttctcctcttatttaaattatttgctttctgcaattgcgatttgtag 15645 

57 KISFTFVILFLLLFKLFAFCNCDL*
44AHJDORF016

1 MKVDDIVTLRVKGYILHYLDDDNEYIEE

15768 tttttaccacttcacgagtatcatttaaccaaaacacaagcaaaagaattattaccagacacatgtaaactattgtccactaca 

29 FLPLHEYHLTKTQAKELLPDTCKLLSTT

15684 cgcacaacgaaaacaattcaagtttattacaatgatttactacaaatcgcaattgcagaaagcaaataa 15616 

57 RTTKTIQVYYNDLLQIAIAESK*
44AHJDORF017

10757 atggaaagattaaaattgcttctgctggtataccgaaaaacgcctttgatacaagcgtcgattttgaaacctttgtacgtgaac

1 MERLKLLLLVYRKTPLIQASILKPLYVN

10673 aattctttgacggtgccattattgaaaacaataaaagtatctataatgagcaaggtacaatatcgatatatccgtctaaaactg 

29 NSLTVPLLKTIKVSIMSKVQYRYIRLKL

10589 aaattgcatgtggtaatgtatatgatgaatattttactgatgaacttaatatga 10536 

57 KLYVVMYMMNILLMNLI*
44AHJDORF018

1098 atgttaattggtactgtgtccataatcacgtattcttcactatattgtccaataaaatcttgctctttagctaaccaattaaaa

1 MLIGTVSIITYSSLYCPIKSCSLANQLK

1014 cgattacctaatgcaatatcgattaataaagtctcattaatcttagggaataaatatttatttacaaatgtttcgaacattgta

29 R LPNAISINKVSLILGNKYLFTNVSNIV 

930 tttgaattatcccatttgtcgccaaatgtccaagattttgaataa 886 

57 FELSHLSPNVQDFE*
44AHJDORF019

9836 atgttacctggtttgtataagtattcttttttgaataaaggtacaccaattgcttttttatatttttctggtaactgtgcatat

1 MLPGLYKYSFLNKGTPIAFLYFSGNCAY

9752 gtccagttaccaccaatcacacgaccactttttccatttggcttgactgatttaccactaattggtttatggtctccgtcatca 

29 VQLPPITRPLFPFGLTDLPLIGLWSPSS

9668 tcagtaggattagaactactactcccactatctacttga 9630 

57 SVGLELLLPLST*
44AHJDORF121

1 MENETKNIELKHVFRFKNGSLCIALDR

16278 acagaaaatgaaatttcattttatgatgttgacattgatgaaattgaagatttaaatcataattctgttttacgcgtaatttca 

29 TENEISFYDVDIDEIEDLNHNSVLRVIS

16194 actttattaggaagtgataataatggttaa 16165 
57 TLLGSDNNG*
44AHJDORF020

1 MSKRFCFTMFLLLVIVYDVVYSVKFIKW

13949 atgttgcataatataaaaagttatacctcacatcttcatcatcaatatttgtcactggtctatctgatttaccaatttctttat

29 MLHNIKSYTSHLHHQYLSLVYLIYQFLY

14033 ataaagtatcgatttctttaa 14053 
57 IKYRFL*
44AHJDORF123

1 MYEGNNMRSMMGTSYEDSRLNKRTELNE

698 aacatgtcaattgatacaaataaaagtgaagatagttatggtgtacaaattcattcactttcaaaacaatcatttacaggtgac 

29 NMSIDTNKSEDSYGVQIHSLSKQSFTGD

782 gttgaggaggaataa 796 

57 VEEE*
44AHJDORF021

5816 atgcaccatcaaagtcaacacctgccccctcatgcttatatatccattcttttgcttgttgttgtgatttcatttatatcactc 
1 MHHQSQHLPPHAYISILLLVVVISFISL

5732 ctatttttgatgttttgctacccaaccatattcacgatgttttgtttccgcattaacattactgaagaattcttcatattccga 
29 LFLMFCYPTIFTMFCFRINITEEFFIFR 
5648 tatattagcctctaa 5634 
57 Y I S L * 

44AHJDORF022 

8611 atgtttgctaaaatgacaatacagaatatcaataattttttagaaaatcctctcattgatttttttgaccataagttattattt 
1 MFAKMIIQNINNFLENPLIDFFDHKLLF

8527 ttaattgcttttgaaatacctgtaataatatcaacgaacattaatacaaataaaaagtag 8468 
29 LIAFEIPVIISTNINTNKK*

44AHJDORF023 

6494 atgagaacaccccccaaggaataccaacactgtaactattacctgtttttccattccattggcgcactggtaaataataacgtg 
1 MRTPPKEYQHCNYYLFFHSIGALVNNNV

6410 tgccttgccagttataaccaatccatacgtaaccatctgataaacaaacttcgttatatggtgtataaccgtttggttggaacc 
29 CLASYNQSIRNHLINKLRYMVYNRLVGT
6326 aatagccattag 6315 
57 NSH*

44AHJDORF024 

14275 gtgtcaatgtacgcctcttgtaaatctttatcatcaaatttaaaattaacattactaaaatcatttaaaaataaatctttttct 
1 VSMYASCKSLSSNLKLTLLKSFKNKSFS

14359 tgctcttttctagcttctctttcttttttccatctatccatttcagacgtatgtctaaccaatgttatcaacctccatataaag 
29 CSFLASLSFFHLSISDVCLTNVINLHIK 

14443 cataaataa 14451 

57 HK*
44AHJDORF025

15175 atggaacgtaaatacaaaacggtattattatattgcgatgagattaaaggacattttccacatcaaatctcaatgtttgaagat 

1 MERKYKTVLLYCDEIKGHFPHQISMFED

15091 ttatatgacgctaaagttgtatattcatattatgaatacaacctgttcactaaaaaatacgcgtatatcatagaatacattaag

29 LYDAKVVYSYYEYNLFTKKYAYIIEYIK

15007 gagatataa 14999 

57 EI*
44AHJDORF026 

14593 atgaataacctattaaacatagccattgttttccttttagcatttttaattacacttatcatacttatgacactgcatatacgc 

1 MNNLLNIAIVFLLAFLITLIILMTLHIR

14509 gtgtcatttggtgttttattcactacattgattatattctatattatctttttaatggttatttatgctttatatggaggttga 

14426 

29 VSFGVLFTTLIIFYIIFLMVIYALYGG*

44AHJDORF027

12916 atgattgtctatatccctaattttagtacaaaattcatattgttttgtatatggtacaacgataatatttgtcataaaagtagt 
1 MIVYIPNFSTKFILFCIWYNDNICHKSS

13000 tacattatacatgactttaatatatttatcatcagttttgatatagaagaaatcaccgttttgattgatgtgatttcttaa 

13080 

29 YIIHDFNIFIISFDIEEITVLIDVIS*

44AHJDORF029

15183 gtgtttaaatggaacgtaaatacaaaacggtattattatattgcgatgagattaaaggacattttccacatcaaatctcaatgt 
1 VFKWNVNTKRYYYIAMRLKDIFHIKSQC

15099 ttgaagatttatatgacgctaaagttgtatattcatattatgaatataacctgttcactaaaaaatacgcgtatatcatag 

15019 

29 LKIYMTLKLYIHIMNITCSLKNTRIS* 
44AHJDORF028

9235 atggaatatatgcacgtccaattgtacctgctttcatattttttgcaaaatctgcattaccttttctttgtacgtcttgtggta 
1 MEYMHVQLYLLSYFLQNLHYLFFVRLVV

9151 caaagtggacgatgttacctgcgtcataccaagacggttgtccagcttgtcttgattgtgatactaactttcttgctatga 9071 
29 QSGRCYLRHTKTVVQLVLIVILTFLL*

44AHJDORF030 

14487 gtgaataaaacaccaaatgacacgcgtatatgcagtgtcataagtatgataagtgtaattaaaaatgctaaaaggaaaacaacg
1 VNKTPNDTRICSVISMISVIKNAKRKTM

14571 gctatgtttaataggttattcatggtcaatcactttcccattaccgtatatgactttgttttgataaataatcattaa 14648 
29 AMFNRLFMVNHFPIIVYDFVLINNH*

44AHJDORF031 

11039 atgatattgtatagttcattgttatcatccaaacggaataagttaaaatgtgaacgtaatgcaggtatgccatataatccaccc
1 MILYSSLLSSKRNKLKCERNAGMPYNPF 
11123 aaaacgactttagataacataacctcctcatttgagtatgggtgttcgttgatatcatcagtaatgtga 11191 
29 KTTLDNITSSFEYGCSLISSVM*

44AHJDORF135
693 atgaaaacatgtcaattgatacaaataaaagtgaagatagttatggtgtacaaattcattcactttcaaaacaatcatttacag 

1 MKTCQLIQIKVKIVMVYKFIHFQNNHLQ 

777 gtgacgttgaggaggaataataaattatggcacaacaatctacaaaaaatgaaactgcacttttag 842 

29 VTLRRNNKLWHNNLQKMKLHF*

44AHJDORF033

3795 atgccattatttaaccacctctaccaaatttgtaaaaaacattttttatcaaattcatttaaaattttctttcttaaatcgcac 

1 MPLFNHLYQICKKHFLSNSFKIFFLKSY
3711 gctttatcaatattatcaattaaatactgcttagtgaattgtgtaccttttgcattacctttttga 3646
29 ALSILSIKYCLVNCVPFALPF*

44AHJDORF032

9455 atggcttgttttgctaaagcgagtagtgaactaccactgtcaccactactaccactgtcagacgaatcactaggtgatccacct
1 MACFAKASSELPLSPLLPLSDESLGDPP

9371 ttaccgtctaatttaccaccccaagctagaatagtattcgcaccgtctaaaaatggattaccatag 9306 
29 LPSNLPPQARIVFAPSKNGLP*
44AHJDORF034

14146 atgatgattctaataataaaaacgctaaaaagcataaatacgctttatataatttacaagctaaaaataataattcttcaatgc 
1 MMILIIKTLKSINTLYIIYKLKIIILQC

14062 ataaatatattaaagaaatcgatactttatataaagaaattggtaaatcagatagaccagtga 14000 
29 INILKKSILYIKKLVNQIDQ*
44AHJDORF035

13957 atgcaacatttgacgaataaatttaacactgtaaacgacatcataaactattacaaggagcaaaaacatggtaaaacaaaatcg 
1 MQHLTNKFNTVNDI INYY KEQKHGKTKS 

13873 tttagacatggtaagagattatcaaaatgctgtcaatcatgtcagaaaaaaaatcccagataa 13811
29 FRHGKRLSKCCQSCQKKNPR*

44AHJDORF036

10165 gtgtacacaataccacacgtgatggtgcaacatatggtggtacattatagtttgcaactaaaaacgaaccatcttcaaaaactg 
1 VYTIPHVMVQHMVVHYSLQLKTNHLQKL
10081 ctacaacaacacctgtgtgaccaataccatatgcagttgcttgtaagtatggtggtttactag 10019 
29 LQQHLCDQYHMQLLVSMVVY*
44AHJDORF037 

14788 atgtcgatatctaacgtaaataactctttttcaatttcaaaatcatcatattgtttgtcaaactcaatatacacatcacccata 
1 MSIS NVNNSFSISKSSYCLSNSIYTSPI 

14872 tttatttttactatacattttttattagatgaagtaaatttttcaaatttatcattataa 14931 
29 FIFTIHFLLDEVNFSNLSL*

44AHJDORF038 

3671 gtgtaccttttgcattacctttttgattttgattacgttttgcgttttgattaccttcgttactcgatttattcacagttttac 
1 VYLLHYLFDFDYVLRFDYFRYSIYSQFY

3587 cgttatcaatcgtattattatcagcgaatcgtaacgttgtattatcaacatcaatgttaa 3528 
29 RYQSYYYQRIVTLYYQHQC*

44AHJDORF039

1743 gtgctgtatttacttatgatgtatctaaacttaaagagtttactggcaacgttgaagaaattaaaccaaaatcagatttatatg 

1 VLYLLMMYLNLKSLLATLKKLNQNQIYM

1827 cgtttattttggatattaattcaattaaatataaacgttacacaaaaggtatgttaa 1883 

29 RLFWILIQLNINVTQKVC*

44AHJDORF040 

9740 gtggtaactggacatatgcacagttaccagaaaaatataaaaaagcaattggtgtacctttattcaaaaaagaatactcacaca

1 VVTGHMHSYQKNIKKQLVYLYSKKNTYT

9824 aaccaggtaacatatttcctcaaacgggtaatgcaggacaatgtacagaattaa 9877 

29 NQVTYFLKRVMQDNVQN*
44AHJDORF041 

15836 atgtcgtcaactttcattattatatcactcctttctaaaaaacgtaaacgttatacgtttcataaaatcctttatgcacacccc 

1 MSSTFIIISLLSKKRKRYTFHKILYAYS

15920 attgttctattgggtcatcaccagcaatataagacaatattgattctggtttag 15973 

29 IVLLGHHQQYKTILILV*
44AHJDORF042



5151 atgcacgaccgtcgtcttttgttaatctatagttttgtgaacctcttgcgcgtaatgcttcaaagtgttcatactcaccaagtt 
1 MHDRRLLLIYSFVNLLRVMLQSVHTHQV

5067 ggaagaaaccatataaattatggaaacgttttccaccaccgccgtttgtcatag 5014
29 GRNHINYGNVFHHRRLS * 

44AHJDORF043 

4539 atgcgacttgtaacagttttgcaacaccatcgtgatgtaaccagattttcatttcaccattggattgacgttctaatccgattg 
1 M RLVTVLQHHRDVTRFSFHHWIDVLIRL 

4455 ttgtaccatgaccaccctgtacaatacgcatgcttgaaattaagtcaccactag 4402 
29 LYHDHPVQYACLKLSHH * 

44AHJDORF044

12917 atgttacctatttacgtgatgatatgttttataaagaaaacatggaacgttattactacaatccaagcaatttacattttgaca 

1 ML P I .YVMI CF I KKTWNVI TTI QAIY'ILT 

12833 atgcttactctaaaaattacgtggttgataatgatagatatttatattcag 12783
29 MLTLKITWLIMIDIYI * 

77^^YtVa 9 ttgttttgaaagtgaatgaatttgtacaccataactatcttcacttttatttgtatc 

1 mivLKVNEFVHHNYLH FYLYQLTCFHLI 

686 ctgttcgtttatttaatcttgaatcttcatatgatgtacccatcatag 639
29 LFVYL I LNLHMMYPS * 

44AHJDORF046

4891 atgattatccatttaagttatcatatcaagacggtattaatttcccacgtgataactttaaagagcctgagggtactcgcaccc 

I miihLSYHIKTVLISHVITLKSLRVFAF 
4975 atacaaatccaaaaacaaaacgtaaatcgttattacttgccatga 5019 
29 I QIQKQ NVNRYYLL* 

44AHJDORF047

11911 atgaatgtatgtaagttgctcaggtgtgagttttgcaaaacatttcacagcacagtcataggcttcactatcattcacaccacc 

1 MNVCKLFRCEFCKTFHSIVIGFTIIHII 
11995 atccttatcaaaaatcgtataattaaaatctgttttaagctgtga 12039

10739 gcaattttaatctttccattcacttcatatgcatatttcttatga 10783 

15256 actcaatcaattttggtcaagtatcattttaatacaatttcatag 15212 

5868 acggtaaagttcgcatgtggggtaatgctaaagacgcgataa 5909 

13242 gcttttgtaataaattgtatatatttaaattggataatatag 13283 

10982 gttcattgtataacttattggttcctttccaatacttaa 10944 

14254 tgccctattttcctaaagaaagaaacgtatcatatgtaa 14216 

3432 ctgatgaaacatcgaatcaaaatgctacatctttag 3467 

7635 ttggttatcataatgaagttcgagtatatccagtag 7670 
29 LVIIMKFEYIQ * 
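
Each pair of lines in this listing gives a nucleotide segment with its start and end coordinates, followed by the predicted amino acid sequence. The pairing can be checked with a minimal sketch like the one below, which assumes the standard genetic code and lowercase `acgt` input. The names `CODON_TABLE` and `translate` are illustrative and not part of the original disclosure, and the sketch does not treat alternative start codons specially.

```python
# Standard genetic code; codons are ordered with bases t, c, a, g.
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(nt: str) -> str:
    """Translate a nucleotide string codon by codon; '*' marks a stop."""
    nt = "".join(nt.lower().split())          # drop spaces left by extraction
    usable = len(nt) - len(nt) % 3            # ignore a trailing partial codon
    codons = (nt[i:i + 3] for i in range(0, usable, 3))
    # Codons with unexpected characters are reported as 'X' rather than guessed.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Segment 7635-7670 from the listing above.
segment = "ttggttatcataatgaagttcgagtatatccagtag"
protein = translate(segment)
print(protein)
```

A mismatch between the translated string and the printed protein line is a useful signal that one of the two lines was damaged during extraction.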

15789 catcatctaagtaatgaagtatataacctttga 15821 

X V S I T L Q V T K W N Y L B T R Q K K L K K w 

5596 tgtcaagtggtaacgcagtcggtgaagtaa 5625 

10205 tattctttagtggtattgcttaattaa 10231 

10851 ataaactggggttcaataagggtttaa 10877 
29 INWGSIRV* 



618 tacatgtttaaattcctcctaatctaa 592 
8276 ttaacatggggtcgatgtcacgtatag 8250 
6173 ctttgctattactacctgaaaatttag 6147

29 LCYYYLKI * 
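
Segments whose end coordinate is smaller than the start coordinate (for example 6173 to 6147) appear to lie on the complementary strand and are printed in their reading direction. The sketch below, which is only an illustration under that assumption, derives the strand and length from the two coordinates and recovers the forward-strand text. The helper names `segment_info` and `reverse_complement` are hypothetical.

```python
from typing import Tuple

COMPLEMENT = str.maketrans("acgt", "tgca")


def segment_info(start: int, end: int) -> Tuple[str, int]:
    """Return the strand ('+' or '-') and the segment length in nucleotides."""
    strand = "+" if end >= start else "-"
    return strand, abs(end - start) + 1


def reverse_complement(nt: str) -> str:
    """Reverse-complement a lowercase nucleotide string."""
    return nt.translate(COMPLEMENT)[::-1]


line = "6173 ctttgctattactacctgaaaatttag 6147"
start_text, seq, end_text = line.split()
strand, length = segment_info(int(start_text), int(end_text))

# For a minus-strand segment, the forward-strand text is the reverse complement.
forward = reverse_complement(seq) if strand == "-" else seq
print(strand, length, forward)
```

Comparing `length` with `len(seq)` gives a quick consistency check on the printed coordinates.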

1 MC.FGVLI KYLLRLSFYF SSYLNYLLSAI 

15635 gcgatttgtagtaaatcattgtaa 15658 

29 AICSKSL* 
44AHJDORF062



4285 gtggtattcgcaacgcagttaaccaatctattaatattgataaagaaacaaatcacatgtactctacacaatccgattctcaaa 
1 V VFATQLTN LLILIKKQ ITCTLHNPILK 

4369 aacctgaaggtttttggataa 4389
29 NLKVFG* 
4^JD°R^ 

! mrl .VFFLIILAWLVLLKRVVN YHCHHYY 

9403 cactgtcagacgaatcactag 9383 
29 HCQTN H* 

50^ O Ttggtggaaaacgttt^ 

1 VVENVSI IYMVSSNLVSMNTL KHYAQEV 

5113 cacaaaactataaattaa 5130 
29 H K T I N * 

44^JDORF^ 

1 M TSQSINLCPKYIT VHHLLKCHLC LMQM 



2693 acgatatcattgatttaa 2710 

29 T I S L I * 

1 

10397 ctgatttgcatatattaa 10380 

29 L I C I Y * 



M~V F F Il'k V TS VHFHLTTYFQLNVQYITN 
Table 19

Sequence similarities between phage 44AHJD ORFs and public databases



Phage: 44AHJD 
Database: nr 



Query= sid|110871|lan|44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1
(761 letters)

gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0...     55  1e-06
gi|1072656|pir||S51275 DNA polymerase - phage CP-1 >gi|836593|e...     53  6e-06
gi|1429230|emb|CAA67649| (X99260) DNA polymerase [Bacteriophage...     49  1e-04
gi|1572479|emb|CAA65712| (X96987) DNA polymerase [Bacteriophage...     46  0.001
gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP...     45  0.002
gi|2435429 (AF012250) unassigned reading frame (possible DNA po...     45  0.002
gi|1084487|pir||S41618 DNA polymerase - slime mold (Physarum po...     45  0.002
gi|4877819|gb|AAD31446.1| (AF133505) DNA polymerase [Neurospora...     44  0.004
gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE >gi|2833...     44  0.004
gi|2499511|sp|Q12471|6P22_YEAST 6-PHOSPHOFRUCTO-2-KINASE 2 (PHO...     41  0.041
gi|2258375|gb|AAD11909.1| (AF007261) transcription initiation f...     40  0.070
gi|15734|emb|CAA37450| (X53370) DNA polymerase (AA 1-575) [Bact...     39  0.092

42 0.010

Query= sid|110872|lan|44AHJDORF002 Phage 44AHJD ORF|3789-5732|3
(647 letters)

gi|135273|sp|P27622|TAGC_BACSU TEICHOIC ACID BIOSYNTHESIS PROTE...    112  7e-24
gi|142847 (M64050) DNase inhibitor [Bacillus subtilis]                 52  1e-05
gi|4038407 (AF103943) factor C protein precursor [Streptomyces ...     39  0.10

Query= sid|110873|lan|44AHJDORF003 Phage 44AHJD ORF|6626-8389|2 
(587 letters) 

gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) >...     92  8e-18
gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) >...     82  1e-14
gi|1429238|emb|CAA67657| (X99260) tail protein [Bacteriophage B...     78  2e-13
gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] >gi|2...     71

li 1181968 |emb|CAA87738.l| (247794) tail protein (Bacteriophage .. . ,4 3e0fi 
gi|1181970|emb|CAA87740.l| (Z47794) tail protein (Bacteriophage. 

Query= sid|110875|lan|44AHJDORF005 Phage 44AHJD ORF|12643-13890|-1
(415 letters)

gi|3845203 (AE001399) GAF domain protein (cyclic nt signal tran...     52  6e-06
gi|3758843|emb|CAB11128.1| (Z98551) predicted using hexExon; MA...     49  5e-05
gi 3845297 (AE001421) hypothetical protein [Plasmodium falciparum] 4d le 3. 
gi|4493936|emb|CAB38972.1| (AL034556) predicted using hexExon; ...
gi|3845165 (AE001390) hypothetical protein [Plasmodium falciparum]

Query= sid|110877|lan|44AHJDORF007 Phage 44AHJD ORF|2044-3027|1
(327 letters)

gi|1181960|emb|CAA87731.1| (Z47794) connector protein [Bacterio...     46  5e-04
qi 1429239 emb CAA676S8 1 (X99260) upper collar protein (Bacten .. . - 8e 04 
gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ...
gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ...

Query= sid|110878|lan|44AHJDORF008 Phage 44AHJD ORF|3020-3775|2
(251 letters)

gi|4982468|gb|AAD30963.2| (AF118151) SNF1/AMP-activated kinase ...     52  3e-06

giil730077|Ipipi8160|KYKl T DICDI "-^^J™™™ Z'.V. " lilt 

) >... 46 3e-04 

ai I 585795 I sp I vmso i ke.oj._iww* ---- - n^-ru 

li 172372 (MS8728) DNA-binding protein (Saccharomyces cerevisiae] 4* 3e 04 



47 2e-04 
46 6e-04 



44 0.002 
41 0.009 



1 / jgu / / 3D rioiou i mu*^*^- 

gi 37S88S5lemblCAB11140.il (298SS1) predicted using hexExon; 

gi|585795|sp|P21538|REB1_YEAST DNA-BINDING PROTEIN REB1 (QBP) >...     46  3e-04
gi|172372 (M58728) DNA-binding protein [Saccharomyces cerevis...
gi|2952545 (AF051898) coronin binding protein [Dictyostelium di...     45  6e-04
gi|535260|emb|CAA82996| (Z30339) STARP antigen [Plasmodium reic...     45  7e-04
gi|1429240|emb|CAA67659| (X99260) lower collar protein [Bacteri...     44  0.001
Query= sid|110879|lan|44AHJDORF009 Phage 44AHJD ORF|5744-6496|2
(250 letters)

gi|2764981|emb|CAA69021.1| (Y07739) N-acetylmuramoyl-L-alanine ...    180  1e-44
gi|113675|sp|P24556|ALYS_STAAU AUTOLYSIN (N-ACETYLMURAMOYL-L-AL...    118  6e-26
gi|1763243 (U72397) amidase [bacteriophage 80 alpha]                  118  6e-26
gi|4574237|gb|AAD23962.1|AF106851_1 (AF106851) LytN [Staphyloco...     84  9e-16
gi|3767593|dbj|BAA33856.1| (AB015195) LytN [Staphylococcus aureus]     84  9e-16
gi|2764983|emb|CAA69022.1| (Y07740) cell wall hydrolase Ply187 ...     77  2e-13
gi|3287732|sp|O05156|ALE1_STACP GLYCYL-GLYCINE ENDOPEPTIDASE AL...     73  2e-12
gi|79926|pir||A25881 lysostaphin precursor - Staphylococcus sim...     69  3e-11
gi|126496|sp|P10548|LSTP_STAST LYSOSTAPHIN PRECURSOR (GLYCYL-GL...     69  3e-11
gi|3287967|sp|P10547|LSTP_STASI LYSOSTAPHIN PRECURSOR (GLYCYL-G...     69  3e-11
gi|3341932|dbj|BAA31898.1| (AB009866) amidase (peptidoglycan hy...     63  6e-11

Query= sid|110882|lan|44AHJDORF012 Phage 44AHJD ORF|8391-8813|3
(140 letters)

gi|140528|sp|P24811|YQXH_BACSU HYPOTHETICAL 15.7 KD PROTEIN IN ...     80  6e-15
gi|4126631|dbj|BAA36651.1| (AB016282) ORF45 [bacteriophage phi-...     76  1e-13
gi|141088|sp|P26835|YNGD_CLOPE HYPOTHETICAL 14.9 KD PROTEIN IN ...     61  4e-09
gi|2293160 (AF008220) YtkC [Bacillus subtilis] >gi|2635548|emb|...     36  0.099
gi|1181973|emb|CAA87743.1| (Z47794) holin protein [Bacteriophage...    31  3.3
Table 20

Homologies between phage 44AHJD ORFs and proteins in public databases

Query= pt|110871 44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1 1
(761 letters) 
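
The coordinates in each query header can be cross-checked against the reported protein length: an ORF spanning `start` to `end` covers `(end - start + 1) / 3` codons, one of which is the stop codon. The following is a minimal sketch of that check, assuming the cleaned header layout used in Table 19 (`sid|...|lan|... ORF|start-end|frame`). The regular expression and the function name `expected_letters` are illustrative only.

```python
import re

# Assumed layout of a cleaned Table 19 query header.
HEADER = re.compile(
    r"Query=\s*sid\|(?P<sid>\d+)\|lan\|(?P<orf>\S+)\s+Phage\s+(?P<phage>\S+)"
    r"\s+ORF\|(?P<start>\d+)-(?P<end>\d+)\|(?P<frame>-?\d+)"
)


def expected_letters(start: int, end: int) -> int:
    """Amino acids encoded by an ORF, excluding the terminal stop codon."""
    return (end - start + 1) // 3 - 1


header = "Query= sid|110871|lan|44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1"
match = HEADER.match(header)
start, end = int(match["start"]), int(match["end"])
letters = expected_letters(start, end)
print(match["orf"], match["frame"], letters)
```

For ORF001 the computed value agrees with the "(761 letters)" line, and the same arithmetic gives 647 for ORF002 (3789-5732).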

>gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0161
DNA-directed DNA polymerase (EC 2.7.7.7) - phage M2
>gi|215509 (M33144) DNA polymerase [Bacteriophage M2]
Length = 572

Score = 55.4 bits (131), Expect = 1e-06

Identities = 96/426 (22%), Positives = 159/426 (36%), Gaps = 88/426 (20%)

Query: 22 9 KLTPEQLTYIHNDVIILGMCHIHYSDIFPNFDYNKIiTFSLNIMESYLNNEMTR FQ 283 

+VTPE + YI ND+ 1+ DI +++T + ++ + + T+ F 

Sbjct: 154 EITPEEYEYIKNDIEIIARA LDIQFKQGLDRMTAGSDSLKGFKDILSTKKFNKVFP 209 

Query: 284 LLNQYQDIKISYTHYHFHDMNFYDYIKSFYRGGLNKWTKYINKIilDEPCFSIDINSSYP 343 

L+ D + 1 + YRGG N KY K I E D+NS YP 

Sbjct: 210 KLSLPMDKEI RKAYRGGFTWLNDKYKEKE IGEGMV- FDVNSLYP 252 

Query: 344 YV>miEKIPTWLYFYEHYSEPTLIPTFIi)DDNYFSLYKIDKDVFNDDLLIKIKSRVLRQM 403 

MY >P YP+ +D+LYI+F+L K + + 
Sbjct: 253 SQMYSRPLP- YGAPIVFQGKYEKDEQYPLY- IQRIRFEFEL KEGYIPTI 299 

Query 404 XXXXXXXXXXXXXXXXXXLRMIQOITGIDCMHIRWSFVIYECEYFHARDIIFQNYFIK 462 

+ ++ +T +D 1+ + + +Y EY F + 

Sbjct: 300 QIKKNPFFKGNEYLKNSGVEPVELYLTNVDLELIQEH-YELYNVEYIDGFK FRE 352 

Query: 463 TQGKLKNKI NMTS P YD YH I TDD I NEH P YSNEEVMLS KWXNGL YG -----IPAL 511 

G K+ 1+ + H + L+K++LN LYG +P L 

Sbjct: 353 KTGLFKDFIDKWTYVKTH EEGAKKQLAKLMLNSLYGKFASNPDVTGKVPYL 403 

Query: 512 RSHFNL-FRLDDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNF 570 

+ +L FR+ D YK+ + F+T+ + + + Q D 
Sbjct: 404 KDDGSLGFRVGDEE YKD PVYT PM - GVF ITAWARFTT ITAAQACY DRI 449 

Query: 571 IYCDTDSLYMKSVVTPLLNPSLFDPIAIX3KWDIENEQIDKMFVUIHKK YAYEVNG 625 

IYCDTDS+++ P + +. DP LG W E+ + L K Y EV+G 

Sbjct: 450 I YCDTDS I HLTGTEV P E 1 1 KD I VD P KKLG YWAHE S - T FKRAJCYLRQ KTY I QD I YVKEVDG 508 

Query: 626 KIKIAS 631 

K+K S 
Sbjct: 509 KLKECS 514 
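
Because the alignment rows carry their own start and end positions, a damaged row can often be detected by counting residues: the number of letters in a row, ignoring gap characters and stray spaces, should equal `end - start + 1`. The sketch below illustrates that check under the assumption that gaps are written as `-`. The function names are hypothetical.

```python
def residue_count(aligned: str) -> int:
    """Count residues in an alignment row, ignoring gaps and whitespace."""
    return sum(ch.isalpha() for ch in aligned)


def span_is_consistent(start: int, aligned: str, end: int) -> bool:
    """True when the row contains exactly end - start + 1 residues."""
    return residue_count(aligned) == end - start + 1


# Final block of the phage M2 alignment above.
query_ok = span_is_consistent(626, "KIKIAS", 631)
sbjct_ok = span_is_consistent(509, "KLKECS", 514)
print(query_ok, sbjct_ok)
```

Rows that fail the check are candidates for comparison against the same query positions in another alignment of the same ORF.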

>gi|1072656|pir||S51275 DNA polymerase - phage CP-1
>gi|836593|emb|CAA87725.1| (Z47794) DNA polymerase
[Bacteriophage CP-1]
Length = 568

Score = 53.5 bits (126), Expect = 6e-06

Identities = 104/464 (22%), Positives = 169/464 (36%), Gaps = 66/464 (14%)

Query: 230 LT?EQLTYIHNDVIIL- -GMCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTRFQLLNQ 287 

♦ PE + YIH DV IL G+ ++Y + F Y + +L + +F+ 
Sbjct: 152 IKPEWIDYIHVDVAILARGIFAMYYEENFTK- -YTSASEAI»TEFKRIFRKSKRKFRDFFP 209 

Query: 288 YQDIKISYTHYHFHDMNFYDYIKSFYRGGLNMYNTKYINKLIDEPCFSIDINSSYPYVW 347 

D k+ D+ ♦ G + K+ + ♦++ DINS YP M 

Sbjct: 210 ILDEKVD DFCRKHIVGAGRLPTLKHRGRTLNQLIDIYDINSMYPATML 257 

Query: 348 HEKIPTWLYFYEHYSEPTLIPTFU)DDNYFSLY-KIDKDVFNDDL-LIKIKSRVLRQMXX 405 

+ p + + Y P > +D+Y+ + K D D+ L I+IK + + 

Sbjct: 258 QNALPIGIP - -KRYKGK- - - PKEIKEDHYYIYHIKADFDLKRGYLPTIQIKKXLDALRIG 312 

Query- 406 XXXXXXXXXXXXXXXXUWIQDITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIKTQG 465 

L + + H + E F +F +Y 
Sbjct: 313 VTITSDYVTTSKNEVIDLYLTNFDLDLFLKHYDATIMYVETLE-FQTESDLFDDYI 366 
Query: 466 KLKNKINMTSPYDYHITDDINEHPYSNEEVMLSKVVLNGLYGIPALR--SHFNLFRLDDN 523

+ Y Y E+SE +K++LN LYG + S L LDD 

Sbjct: 367 TTYRYK KENAQSPAEKQKAKIMLNSLYGKFGAKIISVKKIAYLDDK 412 

Ouerv 524 NELYNt INGYKNTERNIL F ST FVT S RS L YNLL VP FQ YLT ES E I D DN F I YCDTDS 577 

W Y ' L + + + FVTS + + ++ Q E DNF +Y DTDS 

Sbjct: 413 GILR FKNDDEEEVQPVYAPVALFVTSIARHFIISNAQ ENYDNFLYADTDS 462 

Ouerv 578 LYMKSWKPLI^PSLFDPIAIXJK^ 637 . 

L + + + L+ DP GKW E + K L K Y, E+ + + K 

Sbjct: 463 LHLFHSDSLVLD IDPSEFGKWAHEGRAV-KAKYLRSKLYIEELIQEDGTTHLDV-KG 517 

Query: 638 AFDTSVDFETFVREQFFDGAI IENNKSIYNEQGTISIYPSKTEI 681 

A T E E F GA E +G IY + +1 

Sbjct: 518 AGMTPEIKEKITFENFVIGATFEGKRASKQIKGGTLIYETTFKI 561 
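
Each alignment is preceded by a summary line giving identities, positives, and gaps as a fraction and a percentage. A short sketch for extracting those values and re-deriving the percentages follows. It assumes the cleaned `name = n/d (p%)` layout and assumes that the printed percentages are rounded down, which matches the CP-1 figures shown here but is not stated in the source. The names `STAT`, `parse_stats`, and `floor_percent` are illustrative.

```python
import re

STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_stats(line: str) -> dict:
    """Map each statistic name to (numerator, denominator, printed percent)."""
    return {name: (int(n), int(d), int(p)) for name, n, d, p in STAT.findall(line)}


def floor_percent(numerator: int, denominator: int) -> int:
    """Integer percentage, rounded down (an assumption about the report)."""
    return 100 * numerator // denominator


# Summary line of the phage CP-1 alignment above, in cleaned form.
line = "Identities = 104/464 (22%), Positives = 169/464 (36%), Gaps = 66/464 (14%)"
stats = parse_stats(line)
recomputed = {name: floor_percent(n, d) for name, (n, d, _) in stats.items()}
print(stats, recomputed)
```

A disagreement between a recomputed and a printed percentage suggests that one of the digits on that line was misread during extraction.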

>gi|1429230|emb|CAA67649| (X99260) DNA polymerase [Bacteriophage
B103]

Length = 572
Score = 49.2 bits (115), Expect = 1e-04

Identities = 93/422 (22%), Positives = 155/422 (36%), Gaps = 88/422 (20%)

Query 229 KLTPEQLTYIHNDVIILGMCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTR FQ 2B3 

♦+TPE+ YI ND+ 1+ DI + ++ ♦ + T+ F „ Aft 

Sbjct: 154 EITPEEYEYIKNDIEIIARA LDIQFKQGLDRMTAGSDSLKGFKDILSTKKFNKVFP 209 

Ouerv 284 LLNQYQDIKISYTHYHFHDWFYDYI^ 343 
Query. L+ ^ +I + YRGG N KY K I E D + NS YP 

Sbjct: 210 KLSLPMDKEI - RRAYRGGFTWLNDKYKEKE IGEGMV - FDVNS LYP 252 

Ouerv 344 YVMYH E KI PTWL Y F YEHYS E PT L I PT F LDDDNYF S L YK I D KD VFNDD LL I K I KS RVliRQM 403 
W/ " MY +P Y P ♦ +D+LYI+F+L K + + 

Sbjct: 253 SQMYSRPLP YGAPIVFQGKYEKDEQYPLY-IQRIRFEFEL KEGYIPTI 299 

4 04 XXXXXXXXXXXXXXX^ 462 

++ +T +D 1+ + + +Y EY F + 

Sbjct: 300 Q I KKNP FFKGNE YLKNSGAE PVELYLTNVDLELIQEH - YEMYNVE YIDG FK FRE 352 

Ouerv- 463 TQGKLKKKINMTS PYDYHITDD INEHPYSNEEVMLSKWLNGLYG IPAL 511 

G K 1+ + H + + LYG +P L 

Sbjct: 353 KTGLFKEFIDKWTYVKTH EKGAKKQLAKLMFDSLYGKFASNPDVTGKVPYL 403 

Query- 512 RSHFNL- FRLDDNNELYNI INGYKKTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNF 570 

+ +L FR+ D YK-f + FVT+ + + + Q D 
Sbjct: 404 KEDGSLGFRVGDEE YKD P VYT P M - G V F I T AWAR FTT I T AAQAC Y DRI 449 

Ouerv- 571 IYCDTDSLYMKSvVKPLLNPSLFDPIAI/SKWDIENEQIDKMFVXiWKK YAYEVNG 625 

IYCDTDS+++ p + + DP LG W E+ * L K YA EV+G 

Sbjct: 450 IYCDTDSIHLTGTEVPEI IKDIVDPKKLGYWAHES -TFKRAKYLRQKTYIQDIYAKEVDG 508 

Query: 626 KI 627 
K+ 

Sbjct: 509 KL 510 

>gi|1572479|emb|CAA65712| (X96987) DNA polymerase [Bacteriophage
GA-1]

Length = 578

Score = 46.1 bits (107), Expect = 0.001
Identities = 80/376 (21%), Positives = 146/376 (38%), Gaps = 54/376 (14%)

Ouerv 234 QLTYIHtTOVIILGMCHIHYSDIFPNFDYNKLTFSI^IMESYI^EMTRFQLUIQYQDIKI 293 

+ + Y+ +D++I+ ♦ +F N D+ +T ♦ + +Y EM + +Y + 

Sbjct: 162 EIEYLKHDLLIVAIA LRSMFDN-DFTSMTVGSDALNnTY- -KEMLGVKQWEKYFPVL- 214 

Ouerv 294 SYTHYHFHDMNFYDYIKSFYRGGWmYOTKYlNKLIDE^^^^ 353 
yuery " + I+ y+GG N KY + + . D+NS YP +M ++ +P 

sbjct . 215 SLKVTJS.EIRKAYKGGFTWVTJPKYQ£ETVYGGMV- FDVNSMYPAMMKNKLLP - 264 

Query- 354 WLYFYEHYSEPTLIPTFLDDDNYFSLYKIDKDVFNDDLLIKIKSRVTiRQMXXXXXXXXXX 413 
Y EP + + + LY F + KI ♦> 
Sbjct: 265 YGEPVMFKGEYKKNVEYPLYIQQVRCFFELKKDKIPCIQIKGNARFGQMEYLS 317

Query 414 XXXXXXXXLRMIQD ITGI DCMHI RVNS FVI YECE YFHARDI I FQNYF I KTQGKLKMKINM 473 
Sbjct: 318 TSGDEYVDLY VTNVDWELIKKH- YDIFEEEFIGG- -FMFKGF IGF .35 9 

Query 474 TSPYDYHITDDINEHPYSNEEVMLSKWLNGLYGI PALRSHFN- • LFRLDDNNELYNI IN 531 

Query. + N S E+ + +K++LN LYG A + LD+M L 

Sbjct: 360 FDEYIDRFMEIKKSPDSSAEQSLQAKLMLNSLYGKFATNPDITGKVPYLDENGVLKFRKG 419 

Query: S3, ». 
Sbjct: 420 ELKr -ERDPVYTPMGCFITAYARENILSNAQKLYP RPI YADTDS IHVEGLGEVDA 472 

Query: S89 NPSLFOPIALGKWDIE 604 

+ DP LG WD E 
Sbjct: 473 IKDVIDPKKLGYWDHE 488 

>gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP2)
>gi|75812|pir||ERBP2Z DNA-directed DNA polymerase (EC
2.7.7.7) - phage PZA >gi|216051 (M11813) gene 2 product

[Bacteriophage PZA] >gi|224741|prf||1112171E ORF 2

[Bacteriophage PZA]
Length = 572

Score = 45.3 bits (105), Expect = 0.002

Identities = 98/461 (21%), Positives = 166/461 (35%), Gaps = 110/461 (23%)

Query: 198 QLKTDFNYTIFDKDOTMNDSEAYDYAVXCFAKLTPEQLTY^HOT 

Sbj ct: 129 KIAKDFKLTVLKGDIDYHKERPVGY EITPDEYAYIKMDIQIIAEALL IQF 178 

Query- 258 NFDYNKLTFSLNIMESYLNNEMTR FQLLNQYQDIKISYTHYHFHDMNFYDYIKSF 312 

^ +++ T + ++ ♦ ♦ T + F L+ D ♦♦ Y * 

Sbjct: 179 KQGLDRMTAGSDDLKGFKDIITTKKFKKVFPTLSLGLDKEVRYA 

Query: 313 YRGGLNMYNTKYINl^IDEPCFSIDINSSYPYVMW *' 370 

Sbjct: 223 TOGGFTWLMDRFKEKEIGEGMV-FDVNSLYPAQMYSRLLP YGEPIVFEGKYV 273 

Query 371 LDDDNYFSLYKID KDVF N DDLLIKIKSRVLRQ f OCXXXX X XX X XXXXXXX X XLRMI 425 

D+D +1 K+ + +• IK +SR + 
Sbjct: 274 WDEDYPLHIQHIRCEFELKEGYIPTIQIK-RSRFYKGNEYLKSSGGEIADLW 324 

Query: 426 QDITGIDCMIRVTISFVIYBCEYFHARD^ 485 
Sbjct: 325 -^SnCd-LEL^HYTJLYMvSiSGLK FKATTGLFKDFIDKWTHIKTTSEGAI 375 

Query: 486 NEHPYSNEEVMLSKWLNGLYG iPALRSHFKL-FRU.DNNELYNlINGY 533 

. + L+K++LN LYG +P L+ + L FRL 
Sbjct: 376 KQ LAKLMLNSLYGKFASNPDVTGKVPYLKENGALGFRL 415 

Query: 534 KHTERKIL- -FS.FV^ ^TT^'T^S^^' ^ 
Sbjct: 416 EET103PVYTPMGVFITAWARYTTITAAQACF DRIIYCDTDSIHLTGTEIPDVIKD 470 

Ouerv- 592 LFDPIALGKWDIENEQIDKMFVLMHKKYAY EWGKI 627 

* ^' + DP LG W E+ * V KY EV * GK+ Cln 

Sbjct: 471 IVDPKKLGYWAHES-TFKRAKYLRQKTYIQDIYMKEVDGICL 510 

>gi|2435429 (AF012250) unassigned reading frame (possible DNA
polymerase) [Physarum polycephalum]
Length = 544

Siti^ ?^/5^° 4 (ii%) XP p:sUive°s 02 = 206/545 (37%). Gaps • 104/545 (19%, 
Query- 179 TS I ATLGKXLLDGGYLTESQLKTDFNYTI FDKDtTOMNDSEAYDYAVKCFAKLTPEQLTYI 238 

Sbjct: 62 TQLFKLLKs'lQDSS^FKQ --FTYQNIM YSLEISCF- -LYPKKKILI 105 

Query: 239 HND VI I LGMCH I HYS D I F PNFD YN++" -TFS^IMESY-LMMEMTRFQLLNQYQD 290 

Sbjct: 106 -KDLYNFFSENIIYNDVVKDYKLLAILYNEIQTAYNININRKYILSTASLSLRI FKKSFP 164 
Query: 291 IKISYTHYHFHDMNFYDYIKSFYRGGLNMYNTKYINKLIDEPCFSIDINSSYPYVMYHEK 350

K ♦ D + +YI + Y GG N I + + ♦ + D+NS YPY+M EK 

Sbjct: 165 EKYRLI PHLTRDED- -NYIRKSYIGGRNE I FEHVAQRNYFYDVNSLYPYIMKKEK 217 

Query: 351 IPTWLYFYEHYSEPTLIPTFLDD-DNYFS LYKIDKDVFNDDLL IKIKSRVLRQ 402 

♦ P + Y + + F + +N+F L I+K N +L + IK+ V 

Sbjct: 218 MPIGI PEYRDKEYMKKFEKNIENFFGFIDVLITIEKTNNNIPVLPYRMGIKNNV-EV 273 

Query: 403 MXXXXXXXXXXXXXXXXXXLRMIQD I TG I DCMH I RVNS FVI YECEYFHARD 1 1 FQNYFI K 462 

L + Q 1+ IY + ++++F+ Y + 

Sbjct: 274 GIIYAKGTLRGIYFSEEIKLALKQGYKIIE IYSAYEYKEKEWFEEYVEQ 323 

Query: 463 TQGK-LKNKINHTSPYDYHITDDINEHPYSNEEVMLSKVVLNGLYG -IPALRS 513 

+ LK K D + D L K +LN LYG I + 

Sbjct: 324 MYNRRLKAK DPALKD LYKKLLNTLYGRFGLVYEQIDI ISP 363 

Query: 514 HFNLFRLDDNNELYNI INGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYC 573 

L+DN++ ++ N ++ ♦ ++ + FYT ++IY 
Sbjct: 364 EKEL- - ITD^^^YISHDTTEFIDITA^^^CYNNIAITSAITSYARIF^fYNTIL^^n l ^LHVIYI 421 

Query: 574 DTDSLYMKSWKPLI^PSLFDPIALGKWDIENEQIDKMFVI^KKYAY-EVNGKIKIASA 632 

DTD L++K+ P+ + +L +GK+ +E+ + F+ N K Y Y +N I 

Sbjct: 422 DTDGLFLKN PI PDIALTTSKEMGKFRLESINAEAHFIAN-KFYIYAPINSPI IYKFK 477 

Query: 633 GIPK NAFDTSVDFETFVR EQFFDGAI I ENNKS I YNEQGT ISIYPSK 678 

GIP ND ++ +F +1 NN Y* Q ♦ I Y + 

Sbjct: 478 GIPljQKPIFNIHDIITQHKKILNITLGHHYFTFSIRLNNNQTYSFQASRKRXLIPtTYK^ 537 

Query: 679 TEIVC 683 
I+C 

Sbjct: 538 PWIIC 542 

>gi|1084487|pir||S41618 DNA polymerase - slime mold (Physarum

polycephalum) >gi|509721|dbj|BAA06121.1| (D29637) DNA
polymerase [Physarum polycephalum]
Length = 547

Score = 44.9 bits (104), Expect = 0.002

Identities = 118/545 (21%), Positives = 206/545 (37%), Gaps = 104/545 (19%)

Query: 179 TSIATI/jKKLLDGGYLTESQLKTDFNYTIFDKDNDMNDSEAYDYAVKCFAKLTPEQLTYI 238 

T+LKLD+TQ F NM Y +CFL P++ I 

Sbjct: 65 TQLFNLLKSLQDSSFYTFKQ- - - FTYQNIM YSLEISCF- -LYPKKKILI 108 

Query: 239 HNDVI ILGMCHIHYSDIFPNFD YNKL- -TFSLNIMESY-LNNEMTRFQLLNQYQD 290 

D+ +1 Y+D+ ♦ + YN++ +++NI Y L+ +♦ + 

Sbjct: 109 -KDLYNFFSENIIYKDVVKDYTCIilAILYNEIQTAYNININRKYILSTASLSIiRIFKKSFP 167 

Query: 291 IKISYTOYHFHDMNFYT3YIKSFYRGGLNMYNTKYINKLIDEPCFSIDINSSYPYVH 350 

K + D ♦ +YI+ YGGN I ♦ ♦ + ♦ D+NS YPY+M EK 

Sbjct: 168 EKYRLI PHLTRDED- -NYIRKSYIGGRNE I FEHVAQRNYFYDVNSLYPYIMKKEK 220 

Query: 351 IPTWLYFYEHYSEPTLIPTFLDD-DNYFS LYKIDKDVFNDDLL IKIKSRVLRQ 402 

+p + y + ♦ F + +N+F L I+K N +L + IK+ V 

Sbjct: 221 MPIGI PEYRDKEYMKKFEKNIENFFGFIDVLITIEKTNNNIPVXPYRMGIKNNV-EV 276 

Query: 403 MXXXXXXXXXXXXXXXXXXLRMIQDITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIK 462 

L + Q 1+ IY + ++++F+ Y + 

Sbjct: 277 GIIYAKGTLRGIYFSEEIKLALKQGYKIIE IYSAYEYKEKEWFEEYVEQ 326 

Query: 463 TQGK-LKNKINMTSPYDYHITDDINEHPYSNEEVMLSKWLNGLYG IPALRS 513 

+ LK K D + D h K +LN LYG I + 

Sbjct: 327 MYNRRLKAK DPALKD LYKKLLNTLYGRFGLVYEQIDI ISP 366 

Query: 514 HFNLFRLDDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYC 573 

L+DN++ ++ N++ ♦ ♦ FYT + ♦ IY 
Sbjct: 367 EKEL- - ITDNTYISHDTTEFIDITANTCYNNIAITSAITSYARIFMYNTILNYNLHVIYI 424 

Query: 574 DTDSLYMXSWKPLI^PSLFDPIAI/SKWDIEN^ 632 

DTD L++K+ P+ + +L +GK+ +E+ ♦ F+ N K Y Y +N I 

Sbjct- 425 DTDGLFLKN PI PDI ALTTS KEMGKFRLES INAEAHF IAN - KFYI YAP INS P I IYKFK 480 



Query: 633 GIPK NAFDTSVDFETFVR EQFFDGAI I ENNKS I YNEQGT ISIYPSK 678 
GI p N D + + +F+INNY+Q+ I Y ♦

Sbjct: 481 GI PLQKPI FNIKDI ITQHKKILNITLGHHYFTFS IRLNNNQTYSFQASRKRKLI PNYKTT 540 

Query: 679 TEIVC 683 
I+C 

Sbjct: 541 PWIIC 545 

>gi|4877819|gb|AAD31446.1| (AF133505) DNA polymerase [Neurospora
crassa]
Length = 1035

Score = 44.1 bits (102), Expect = 0.004

Identities = 36/172 (20%), Positives = 82/172 (46%), Gaps = 14/172 (8%)

Query 521 DDNNELYNIINGYKOTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYM 580 

+ N EL + ++G K+ I + + + ♦ ++ + + + + S Y DTDS+++ 

Sbjct: 817 EKNYELLSYLDGEKDDGFIINSTSIAAATASWSRILMYKHIINSA YTDTDSIFV 870 

Ouerv 581 KSWKPLLNPSLFDPIALGKWDIENEQIDKMFVliNllKKYAY 640 

+ KPL + + + K + +1+ ++KY + GK++I GI KN + 

Sbjct: 871 E KPLDSAFIGEGCGKFKAEYNGQLIKRAIFISGKLYLLDFGGKLEIKCKGITKNKDN 927 

Query 641 TSVDFETFVREQFFDG AIIENNKSIYNEQGTISIYPSKTEIVCGNVYDE 689 

T+ + + E + +G + + E GT+ + + K ++ G ' YD+ 

Sbjct: 92 8 TTHNLDINDFEALYNGESRVXFQERWGRSLELGTVTVKYQKYNLISG--YDK 977 

>gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE

>gi|283351|pir||S26985 probable DNA-directed DNA
polymerase (EC 2.7.7.7) - Neurospora crassa
mitochondrion plasmid maranhar (SGC3)
>gi|578156|emb|CAA39046| (X55361) putative DNA
polymerase [Neurospora crassa]
Length = 1021



>gi|2499511|sp|Q12471|6P22_YEAST 6-PHOSPHOFRUCTO-2-KINASE 2
(PHOSPHOFRUCTOKINASE 2 II) (6PF-2-K 2)
>gi|2131162|pir||S61066 6-phosphofructo-2-kinase (EC
2.7.1.105) - yeast (Saccharomyces cerevisiae)
>gi|2131163|pir||S71026 6-phosphofructo-2-kinase (EC
2.7.1.105) - yeast (Saccharomyces cerevisiae)
>gi|1085116|emb|CAA62371| (X90861)
6-phosphofructo-2-kinase [Saccharomyces cerevisiae]
>gi|1420028|emb|CAA99157| (Z74878) ORF YOL136C
[Saccharomyces cerevisiae] >gi|1628439|emb|CAA64733|
(X95465) 6-phosphofructo-2-kinase [Saccharomyces
cerevisiae]
Length = 397

Score = 40.6 bits (93), Expect = 0.041

Identities = 48/208 (23%), Positives = 92/208 (44%), Gaps = 29/208 (13%)

Query 175 MKTNTS I ATLGKKLLDGGYLTESQLKTDFNYTI FDKDNDMNDSEAYDYAVKCFAKLTPEQ 234 

+ + S AT+ K LL L+ ♦ ♦ FN K*ND + + +A ++ T 
Sbjct: 139 IRRQISCATISKPLL LSNTSSEDLFN PKNNDKKET YARITLQK 181 

Query 235 LTY-IHNDVIILGMCHIHYSDIFPNFDYN^^ 290 

L + I+ND +G+ SI + F + S + +E++ F L+ Q 

Sbjcf 182 LFHEINNDECDVGIFDATNSTI ERRRFIFEEVCSFNTDELSSFNLVPIILQVSC 235 
Query: 291 IKISYTHYHFHDMNFY-DYIKSFYRGGLNMYNTKYINKLIDEPCFSIDINSSYPYVMYH 348

S+ Y+ H+ +F DY+ Y + + + + F s+D N + Y+ H 

Sbjct: 236 FNRSFIKYNIHNKSFNEDYUDKPYEUVIKDFAKRLKHYYSQFTPFSLDEFNQIHRYISQH 295 

Query: 349 EKIPTWLYFYEHYSEPTLIPTFLDDDNY 376 

E+I T L+F+ + + P L+ +Y 
Sbjct: 296 EEIDTSLFFFNVINAGWEPHSLNQSHY 323 

>gi|2258375|gb|AAD11909.1| (AF007261) transcription initiation
factor sigma [Reclinomonas americana]
Length = 532

^^■l^lo^X^ ^os caps - W20S 

Query 100 NHFLLKDTMRYFDNITRENIYLKSAEENEHTLKMKEATILAKNQNVIL EKRVKSSIN 156 

y * M+ + + F + ++IY+ + +KE L K NVI+ K +K N 

Sbjct: 177 NYLVTCNSYLNLFKTVPHDSIYMNYSYIO/TPLNILKE^ -236 

Query- 157 LDLTMFLNGFKFNIIDNFM KTNTSIATLGKKLLDGGYLTESQLKTDFNYTIFDKDND 213 

L++++FL F + M++ K ♦ + + K L Y+T L T Y K 
Sbjct: 237 LNISLFLYKFYQELKWNYIFINKISRNTQKINIKTLKNSYITFYNLITFIQYYTTKKQRL 296 

Query* 214 MNDSEAYDYAVTCCFAK--LTPEQLTYIKtTOVI^ 270 
W r * D +K F K P+ +N +1 G+ HI* + N K+T I 

Sbjct: 297 KIOIFYKQIFIKTFLKQHKIPKINKIKNNSLIKYGLTHIYDMILISILRENI 356 

Query- 271 MESYLNNEMTRFQLLNQYQDIKISY 295 

+ +Y+ T + QY +KI Y 
Sbjct: 357 IFNYMPYITT ISKQY- -VKIGY 376 

>gi|15734|emb|CAA37450| (X53370) DNA polymerase (AA 1-575)
[Bacteriophage phi-29]
Length = 575

Score = 39.5 bits (90), Expect = 0.092

Identities = 41/150 (27%), Positives = 64/150 (42%), Gaps = 36/150 (24%)

Query- 497 L S KWLNG LYG IPAIJISHFNL-FRLDDNNELYNIINGYKNTERNIL- -F 542 

t ^Tf^+LN LYG +P L+ + L FRL G + T+ + 

Sbjct: 381 LAKLMLNS LYGKF ASN PD VTGKVP YLKENGALG FRL GEEETKDPVYTPM 429 

Query 543 STFVTSRSLYNLLVPFQYLTESB 602 

F+T+ + Y + Q D IYCDTDS+++ P + + DP LG W 

Sbjct: 430 GVF I TAWARYTT ITAAQACY DRIIYCDTDSIHLTGTEIPDVIKDIVDPKKLGYWA 484 

Query: 603 IENEQIDKMFVLNHKKYAY EVNGKI 627 

E+ ++ L K Y EV+GK+ 
Sbjct: 485 HES-TFKRVXYLRQKTYIQDIYMKEVDGKL 513 

Query= pt|110872 44AHJDORF002 Phage 44AHJD ORF|3789-5732|3 1
(647 letters) 

>gi|135273|sp|P27622|TAGC_BACSU TEICHOIC ACID BIOSYNTHESIS PROTEIN C
>gi|478126|pir||D49757 techoic acid biosynthesis protein

tagC - Bacillus subtilis (strain 168) >gi|143727
(M57497) putative [Bacillus subtilis]
>gi|2636103|emb|CAB15594.1| (Z99122) alternate gene
name: dinC [Bacillus subtilis]
Length = 442

Score = 112 bits (278), Expect = 7e-24
Identities = 91/314 (28%), Positives = 147/314 (45%), Gaps = 58/314 (18%)

Query- 152 FELNELEPKFVMGFGGIRNAVNQSINIDKETNHMYSTQSDS QKPEGFWINKLTPSG 207 

F + ♦ PK V QS N D+* ♦ +Y+TQ S ♦ ♦ I +L+ G 

Sbjct: 7 FDFTNITPKLFTELRVADKTVT^SFNFDEKNHQIYTTQVASGLGKDNTQSYT^ 66 

Query: 208 DLISSMRIVQGGHGTTIGLERQSNGEMKIWLHHD GVAKLLQVAY^NYVLDLEEA 262 

SM + GGHGT IG+E + NG + IW +D YK LD E ♦ 

Sbjct: 67 LQLDSMLLKHGGHG-raiGIENR-NGTIYIWSLYDKPNETDKSELVCFPYkAGATLD-ENS 124 
Query: 263 KGLTDYTPQSLLNKHTFTPLIDEA^^^KLILRFGDGTIQVRSRADVKNHID^^/EKEMTIDN 322
U y * K L ++ H TP +D N +L +R + D KN+ N ♦ + +TI N 

Sbjct: 125 KELQRFSNMPF- -DHRVTPAUDMKNRQLAIR QYDTKNN- -NNKQWVTIFN 170 

Ouerv 323 SE NNDN RWMQGIAVDGDDLYWLSGNSSVNSHVQIGKYSLTTGQKI 367 

+ U +N ++QG +D LYW +G+++ S + + ♦ 
Sbjct: 171 LDDAIANKNNPLYTINIPDELHYLQGFFLDDGYLYWYTGDTNSKSYPNL- ITV 222 

Ouerv 368 YDYPFKLSYQDGINFPRD-- NFKEPEGICIYTNPKTKRKSLLLAMTNGGGGKRFH 420- 

+D K+ Q I +D NF+EPEGIC+YTNP+T KSL++ +T+G G R 

Sbjct: 223 FDSDNKIVLQKEITVGKDLSTRYENNFREPEGICKYTNPETGAKSEjMVGITSGKEGNRIS 282 

Query: 421 NLYGFFQLGEYEHF 434 

+Y + YE+F 
Sbjct: 283 RIYAYH SYENF 293 

>gi|142847 (M64050) DNase inhibitor [Bacillus subtilis]
Length = 125

Score = 51.9 bits (122), Expect = 1e-05
Identities = 35/116 (30%), Positives = 55/116 (47%), Gaps = 10/116 (8%)

Ouerv 152 FELNELEPKFVMGFGGIRNAVNQSINIDKETNHMYSTQSDS QKPEGFWINKLTPSG 207 

F+ + PK V QS N D++ + +Y+TQ S + + I +L+ G 

Sbjct: 7 FDFTNITPKLFTELRVADKTVT^SFNFDEK^^ 66 

Query- 208 DLISSMRIVQGGHGTTIGLERQSNGEMKIWLHHD GVAKLLQVAYKDMYVLD 258 

+ S M ♦ GGHGT IG+-E + NG ♦ IW >D ++L+ YK LD 

Sbjct: 67 LQLDS MLLKHGGHGTN I GMENR - NGT I Y I WS L YD KPNETD KS E L VC F P YKAGATLD 121 

>gi|4038407 (AF103943) factor C protein precursor [Streptomyces
griseus]
Length = 324

SSui^ ^^^^Positive" 102/W, (37*,. Caps - 33/2*9 

Query- 172 VNQSINIDKETNHMYSTQSDSQKPEG FWINKLTPSGDLISSMRIVQGGHGTTIGLER 228 

V OS D ++ Q S P+ I +L SG+ + M +♦ GHG + IG ♦ 
Sbjct: 66 VQQSFTFDIVNRRLFVAQLKSGSPDDSGDLCITQLDFSGNKLGHMYLLGFGHGVSIGAQ- 124 

Query- 229 QSNGEMKIWLHHIXjVAKLLQVAYKDNYVIjDLEEAKGLTDYTPQSLLNKHTFTP 281 

+ +W D + + + + GT SLKHP 

Sbjct: 125 PVGADTYLWTEVD VNSNARGTRLARFKWNNGATLS RTS S ALAKHQPVPGATEMTC 179 

Ouerv- 282 LIDEANDKLILRFGDGTIQVRSRADVKNHIDimiKEMTIDNSENtfDNRWMQGIAVDGDDL 341 

ID N+++ +R+ + ♦ +v + V + D QG A+ G + 

Sbjct: 180 AIDPVNNRMAIRYLTASGRRVGIYNVADIAAGVYDKPLSDVPHPTGLGTFQGYALYGSYV 239 

Query- 34 2 YWLSGN SSVNSHVQIGKYSLTTGQKIYDYPFKLSYQDGINFPRDNFKEPEGIC 394 

Y L+GN + HS+V + TG ♦ + ♦ G F+EPEG+ 
Sbjct: 240 YQLTGNPYGPDNPNPGNSYVS - -SVDVNTGALVQ RAFTRAGSTL TFREPEGMG 290 

Query: 395 IYTNPKTKRKSLLLAMTNGGGGKRFHNLY 423 

IY ♦ + L L +G G R NL+ 

Sbjct: 291 IYRTAAGEVR-LFLGFASGVAGDRRSNLF 318 

Query= pt|110873 44AHJDORF003 Phage 44AHJD ORF|6626-8389|2 1
(587 letters) 

>gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9)
>gi|75850|pir||WMBPT9 gene 9 protein - phage phi-29
>gi|215327 (M14782) tail protein [Bacteriophage phi-29]
>gi|225364|prf||1301270D gene 9 [Bacillus sp.]
Length = 599

Score = 92.4 bits (226), Expect = 8e-18
Identities = 126/618 (20%), Positives = 251/618 (40%), Gaps = 71/618 (11%)

Ouerv 5 TNFKFFYNTPFT-DYQOTIHFNSNKERDD^^ 62 

TN ♦ + PF+ DY+NT F S + + ++F R + + SK ♦ F ++ **V 
Sbjcf 9 TNVRILADVPFSNDYKNTRWFTSSSNQYNWF - - NRKSRVYEMSKVTFMGFRENKPYVSVS 66 
Query: 63 MQWHDAQGINYMTFLS-DFEDRRYYAFVNQIEYVNDVVVKIYFVIDTIMTYTQGNVLEQL 121

+ +Y+ F + D+ ++ +YAFV ++E+ N V ++F ID + T+ ++ 

Sbjct: 67 LPIDKLYSASYIMFQNADYGNKWFYAFVTELEFKNSAVTYVHFEIDVI^TWMFDMKFQES 126 

Query: 122 SNVNIERQHLSKRTYNYMLPMLRNNDDVLKVSNKNYVYNQMQQYLE 181 

I R+H+ K+P+D+L++++ + ++F S 

Sbjct: 127 F---IVREHV-KLWNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDDMMFLVIISKSIM 182 

Query: 182 FGT- -KKEPNLDTSKGTIYDNITSPVNLYVMEYGDFINFMDKMSAYPWITQNFQK V 235 

GT ++E L+ ++ + + P+ Y+ + + D +1 N V 

Sbjct: 183 HGTPGEEESRLNDINASL-NGMPQPLCYYIHPF YKDG£VPKTYIGDNNANLSPIV 236 

Query: 236 QMLPKDFINTKDLEDVKTSEKITGLKTLKQGGKSKEWSLK-DLSL SFSNLQ 285 

ML F + D+ + +T LK K+ + LK D + N+ 

Sbjct: 237 NMLTNIFSQKSAVNDI -VNMYVTDYIGLKLDYKNGDKELKIJ)KDMFEQAGIADDKHGNVD 295 

Query: 286 EMMLSK KDEFKHMIRNEYMTIEFYDWNGNTMLLDAGKISQK 326 

+ + K KD + ++ Y E D+ GN M L 1 + 

Sbjct: 296 T I FVKKI PDYEALEIDTGDKWGGFTKDQESKL^^MYPYCVTE ITDFKGNHMNLKTEYINNS 355 

Query: 327 TGVKLRTKSIIGYHNEVRVYPVDYNSAENDRPIIJUCNKEILIDTGSFI^^ 386 

+K++ + +G N+V DYN+ D + N+ S +N N 
Sbjct: 356 K- LKIQVRGS LGVSNKVAYS VQDYNA DSALSGGNRLTAS LDS S LINNNPN 404 

Query: 3 87 PILI^GILGQSQQANRQ--KNAESQLITNRIDNV^G---SDPKSRFYDAVSVASNLSP 441 

I I N L Q N+ +N +S + + N I ++ G + + A+ +AS + + 
Sbjct: 405 DIAILNDYLSAYLQGNKNSLENQKSSILFNGIMGMIGGGISAGASAAGGSALGMASSV-- 462 

Query: 442 TALFGKFNEEYNFYKQQQAEYKDLALQPPSVTESEMGNAFQIANSINGLTMKISVPSPKE 501 

T + + QA+ D+A PP +T+ AF N G+ + + 

Sbjct: 463 TGMTSTAGNAVLQMQAMQAKQADI AN I PPQLTKMGGNTAFDYGNGYRGVYVI KKQLKAE Y 522 

Query: 502 ITFLQKYYMLFGFEVNDYNSFIEPIMS^CNYLKCTGTYTIRDIDPMLMEQLKAILESG 561 

L + + +G+++N + + NY+ + + DI + +++ + + I + +G 

Sbjct: 523 RRS LS S F FHKYG YK I NRVKK - - PNLRTRKAFNYVQTKDC F I SGD I NNND LQE I RT I FDNG 580 

Query: 562 VRFWHNDGSGNPMLQNPL 579 

+ WH D GN ++N L 
Sbjct: 581 ITLWHTDNIGNYSVENEL 598 

>gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9)
>gi|75849|pir||WMBP9Z gene 9 protein - phage PZA
>gi|216058 (M11813) tail protein [Bacteriophage PZA]
Length = 599

Score = 81.9 bits (199), Expect = 1e-14

Identities = 127/618 (20%), Positives = 248/618 (39%), Gaps = 71/618 (11%)

Query: 5 TNFKFFYNTPFT-DYQNTIHFNSNKERDDYFLNGRHFKSLDYSKQPYNFIRDRME- INVD 62 

TN +. ♦ PF+ DY+NT F S+ + ++F + + SK + R+ . I+V 

Sbjct: 9 TNVRILADVPFSNDYKNTRWFTSSSNQYtJWF- -NSKTRVYEMSKVTFQGFRENKSYISVS 66 

Query: 63 MQWHDAQGINYMTFLS -DFEDRRYYAFVNQ I E YVNDVWKI YFVIDT IMTYTQGNVLEQL 121 

+ + +Y+ F + D+ ++ +YAFV ++EY N ++F ID + T+ N+ Q 

Sbjct: 67 LRLDLLYNASYIMFQNADYGNKWFYAFVTELEYKNVGTTYVHFEIDVLQTW - MFNIKFQE 125 

Query: 122 S NVN I ERQHLS KRT YNYML P MLRNNDD VL KV SNKNYVYN - - QMQQ YLEN L VL FQ S S AD LS 179 

S I R+H+ K + P + D+L ++ + + + Y + + L S + 
Sbjct: 126 SF--IVREHV-KLWNDDGTPTINTIDEGLNYGSEYDIVSVENHRPYDDMMFLWISKSIM 182 

Query: 180 KKFGTKKEPNLDTSKGTIYDNITSPVNLYVMEY GD FINFMDK 221 

+ E L+ ++ + + P+ Y+ + GD +N + 

Sbjct: 183 HGTAGEAESRLNDINASL-NGMPQPLCYYIHPFYKDGKVPKTFIGDNNANLSPIVNMLTN 241 

.Query: 222 MSAYPWITQNFQ KVQML P KD F I NT K DLEDVKTSEKITGLKTLKQGGKSKEWS 273 

+ + N V M D+I K +L+ K + G+ KG + 

Sbjct: 242 IFSQKSAVNNI - -VNMYVTDYIGLKLDYKNGDKELKLDKDMFEQAGIADDKHGNVDTIFV 299 

Query- 274 LKDL SLSFSNLQEMhU^SKKDEFKHMIRNEYKTIEFYDWNGNTMLLDAGKISQKTGVK 330 

K + L + KD+ ++ Y E D+ GN M L I +K 

Sbjct: 300 KKIPDYETLEIDTGpKWGGFTKDQESKLMMYPYCVTEVTO ^58 

Query: 331 LRTKSIIGYHNEWVYPVDYNSAENDRPIIAKNKEILIDTGSFUTTNITFNSFAQVPILI 390 
+♦ + +G N+V DYN+ + L+ + L+T++ N+ + 1 + 
Sbjct: 359 IQVRGSLGVSNKVAYSIQDYNAGGS LSGGDRLTAS- - - -LDTSLINNNPNDIAII - 409

Query- 391 NNG I LGQSQQANRQ - - KNAESQL ITNR I DNVLNGSD PKSRFYD AVS VASNLS P 441 

N L Q N+ +N +S N I +L G A + A SP 

Sbjct: 410 - ND YLS AYLQGNKNS LENQKSS I LFNG I VGMLGGG VSAGASAVGRSPFGLASSV 462 

Ouerv 442 TALFGKFNEEYNFYKQQQAEYKDLAIiQP PSVTESEMGNAFQI ANS INGLTMKI SVPS PKE 501 

. x + ♦ QA+ D+A PP VT+ AF N G+ ♦ * 

Sbjct: 463 TGMTSTAGNAVLDMQALQAKQAD IAN I P PQLTKMGGNTAFD YGNG YRGVYV I KKQLKAE Y 522 

Ouerv 502 ITFLQKYWLFGFEVNDYNSFIEPINSMTVCOTLKCTGTYTIRDIDPMLMEQLKAILESG 561 
^ • L ++ +G+++N * + NY>+ + ,DI* +++ + + I ++G 

Sbjct: 523 RRSLSSFFHKYGYKiraVKK--P^TRKAY^^ 580 

Query: 562 VRFWHNDGSGNPMLQNPL 579

+ WH D GN ++N L 
Sbjct: 581 ITLWHTDDIGNYSVENEL 598 
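
Each alignment in this listing is summarized by a statistics line that gives identities, positives and gaps as a count over the aligned length, followed by a whole-number percentage. The following is a minimal sketch, not part of the original listing, of how such a line could be read programmatically. The names `parse_stats` and `exact_percent` are hypothetical, and the tolerance for characters substituted for "=" is an assumption based on the scan damage visible in this text.

```python
import re

# Matches e.g. "Identities = 127/618 (20%)". The separator class tolerates
# scan substitutions for "=" such as "»", "*", "-" or "=»" (an assumption
# based on the damage visible in this listing).
_STAT = re.compile(
    r"(Identities|Positives|Gaps)\s*[^\s\d]{0,3}\s*(\d+)/(\d+)\s*\((\d+)%\)"
)


def parse_stats(line):
    """Return {name: (count, aligned_length, reported_percent)}."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in _STAT.finditer(line)
    }


def exact_percent(count, aligned_length):
    """Exact percentage behind the whole-number figure printed in the listing."""
    return 100.0 * count / aligned_length


stats = parse_stats(
    "Identities = 127/618 (20%) , Positives = 248/618 (39%), Gaps » 71/618 (11%)"
)
for name, (count, length, reported) in stats.items():
    print(f"{name}: {count}/{length} = {exact_percent(count, length):.1f}% (listed {reported}%)")
```

The listed percentages are whole numbers, so they can differ from the exact ratio by about a point; a larger difference may indicate a digit misread in the scan and is worth checking against the original.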

>gi|1429238|emb|CAA67657| (X99260) tail protein [Bacteriophage B103]
Length = 598 

Score = 77.6 bits (188), Expect = 2e-13

Identities = 130/623 (20%), Positives = 240/623 (37%), Gaps = 86/623 (13%)

Query: 5 TNFKFFYNTPFT-DYQNTIHFNSNKERDDYFLNGRHFKSLDYSKQPYNFI RDRMEIN 60

T+ + F N PF+ DY++T F + + YF ♦ K ♦ NF+ 1 

Sbjct: 9 TDVRI FSNVPFSNDYKSTRWFTNADAQYS YF NAKPRVHVINECNFVGLKEGT PH I R 64 

Query: 61 VDMQWHD AQG I NYMT FLS - D FED RRYYAFVNQ I E YVND VVVKI YFV I DT I MT YTQGKVLE 119

V+ + D YM F + + ++ +Y FV ++EYVN V +YF ID I T* ♦ 

Sbjct: 65 VNKRIDDLYNACYMIFRNTQYSNKWFYC^ 123 

Query: 120 QLStmUERQHLSKRTYNYMLPML^ 179

Q S + E Q + P+ D+ L + V Q +>F 

Sbjct: 124 QPSYIVREHQEMWDANNE pLTNTIDEGLNYGTEYDWAVEQYKPYGDLMFMVCISKS 180 

Query: 180 KKFGTKKEPNLDTSKGTIYDNITS - - - PVNLYVMEYGDFINFMDKMSAYPWITQNFQKVQ 236

K T E G I NI P++ YV + + D S P +T +VQ 

Sbjct: 181 KMHATAGET FKAGEIAANINGAPQPLSYYVHPF YEDGSS - - PKVTIGSNEVQ 230 

Query: 237 ML-PKDFINTKDLEDVKTSEKITGLKT LKQGGKSKEWS LKDLS LS FSNL 284

f PDF + ++ + ++ T ♦ +K SL+D + + 

Sbjct: 231 VSKPTDFLKNMFTQEHAVNNIVSLYVTDYIGLNIHYDESAKTMSLRDTMFEHAQIADDKH 290 

Query: 285 QEMMLSKKDEFKHMIRNEYMTIEFY- DWNGNTM LLDAG K 322

 +E + +F NE + Y D + GN + >

Sbjct: 291 PtTVOTIYLKEVKEYEEKTIDTGYKFASFANNEQ 350 

Query 323 ISQKTGVTCIJlTKSIIGYTiNEVRVYPVDYNS AENDRPILAKNKEILIDTGSFLNTNIT 379 
+ + + +K++ + +G MW DYN+ D+ + A 'J£t%\ 9 m 
Sbjct: 351 VNG-SNLKIQWGSLGVSNKVTYSVQDYNADTTLSGDQNLTAS — CNTSLI 398 

Query 380 FNSFAQVPILINNGII/3QSQQANRQ- - KNAESQLITNRIDNVLN GSDPKSRFYDAVS 434 
v 1+ N L Q N+ +N + + + N + ++L G+ + AV 



Sbjct: 399 NNNPNDVAII - -NDYI*SAYLQGNKNSLENQKDSILFNGVT^MI/3NGIGAVGSAATGSAVG 456 

Query 435 VASNLSPTALFGKFNEEYNFYKQQ^AEYKDLAIiQPPSVTESEMGNAFQIANSINGLTMKI 494 

VAS S T + + QA+ D+A PP + + A+ N G+ + 

Sbjct: 457 VAS - - S ATGMVS S AGNAVLQ I QGMQAKQAD I ANT P PQLVKMGGNT A YDYGNG YRGVYVI K 514

Query: 495 SVPSPKEITFLQKYYMLFGFEVNDYNSFIEPINSMTVCNYLKCTGTYTIRDIDP 554
^ ^' + t, + +G++ N + + + I +++ ♦♦♦♦ 

Sbjct: 515 KQIKEEYRNILSDFSRKYGYTCTNLVTC--MPNLRTRESYNYVQTK^ 572



Query: 555 KAI LE SG VRFWHNDGSGN P MLQN 577 

♦ I +SG+ WH D G+ L N 
Sbjct: 573 RTIFDSGITLWHADPVGDYTLNN 595 



>gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29]

>gi|224163|prf||1011232C protein p9, tail [Bacteriophage phi-29]
Length = 335
Score = 71.0 bits (171), Expect = 2e-11

Identities = 64/293 (21%), Positives = 123/293 (41%), Gaps = 20/293 (6%)

Query 292 KDEFKHMIRNEYMTIEFYDWNGNTMLLDAGKISQKTGVKLRTKS I IGYHNEVRVYPVDYN 351 

KD> Y E D+ GN M L 1+ +K>+ + +G N+V DYN 

Sbjct: 57 KDQESKIJ^MYPYCVTEITDFKGNHMNLKTEYI^SK-LKIQVRGSLCVSNKVAYSVQDW 115 

Query: 352 SAENDRPILAKNKEILIDTGSFLNTNITFNSFAQVPILINNGI LGQSQQANRQ- -KNAES 409

+ D > N+ S +N N I I N L Q N+ +N +S 

Sbjct: 116 A DSALSGGNRLTASLDSSLINNNPN DIAILNDYLSAYLQGNKNSLENQKS 165 

Query 410 QLITNRIDNVLNG SDPKSRFYDAVSVASNLSPTALFGKFJIEEYNFYKQQQAEYKDLA 466 

++ N I ++ G +• + A+ +AS+ + T ♦ + QA+ D+A 

Sbjct: 166 S I LFNG IMGMIGGG I SAGAS AAGGSALGMASSV - -TGMTSTAGNAVLQMQAMQAKQAD I A 223 

Query 467 LQPPSVTESEMGNAFQIANSINGLTMKISVPSPKEITFLQKYYMLFGFEVNDYNSFIEPI 526 
* pp +T + AF N G+ + + L ++ +G+++N + 

Sbjct: 224 NIPPQLTKMGGNTAFDYGKGYRGVYVIKKQLKAEYRRSLSSFFHKYGYKINRVKK--PNL 281 

Query 527 NS^CNYLKCTGTYTIRDIDPMI^EQLKAILESGVRFWHNIXSSGNPMUJNPL 579 

+ ny++ + DI + + ++++ I ++G+ WHD GM ++N L 
Sbjct: 282 RTRKAFNYVQTKDCFISGDINNNDLQEIRTIFDNGITLWHTDNIGNYSVENEL 334 - 

>gi|1181968|emb|CAA87738.1| (Z47794) tail protein [Bacteriophage
CP-1]

Length = 230
Score = 53.9 bits (127), Expect = 3e-06

Identities = 29/113 (25%), Positives = 54/113 (47%), Gaps = 3/113 (2%)

Query: 1 MRKLTNFKFFYNTPF-TDYQNTIHFNSNKERDDYFLNGRHFKSLDYSKQPYNFIRDRMEI 59 

M++ T + +PF DY N I+F + + +D+F ♦ Y + + + I 

Sbjct: 1 MQESTKIWLYAKSPFKNDYANVINFETRESMEDFFTKKNPHIEIVYEYDKFQYTQRNGSI 60 

Query 60 NVTDMQWHDAQGIhT¥MTFLSDFEDRRYYA 112 

V + + + YM F+++ R YYAFV + Y+N+ +1 + +D TY 
Sbjct: 61 WSG RVE KYENVTYMRF I NN - - GRT YYAFVFD VLY I NED ATR 1 1 YEVD VWNTY 111 

>gi|1181970|emb|CAA87740.1| (Z47794) tail protein [Bacteriophage
CP-1]

Length = 586
Score = 42.2 bits (97), Expect = 0.010

Identities = 79/381 (20%), Positives = 139/381 (35%), Gaps = 92/381 (24%)

Query 277 LSLSFSNLQEMMLSK- - KDEFK HMIRNEYKTIEFYDWNGNTMLLDAG KISQKT 327 

L +QE + S KD+ + ++ +E+ IE YD GN+ + I ♦ 

Sbjct: 187 LKIAYDQIQEGLRSYMGKDDLEIEVQLLNSEFTEIELYDIYGNSYVYQPQYLPRTIDE1AH 246 

Query: 328 GVKLRT KS 1 1 G YHNEVRVY P VD YNS AEN DRPIL 360 

K+ +G N+V + ++YN+A N D+ IL 

Sbjct: 247 KYKVIVSGSI/SDSNQVmNFLEYNNAN^^ 306 

Query 361 -AKNKEILIDT-GS FLNTNITFNSFAQVPILINNGILGQSQQANRQKNAESQLITNRIDN 418 

Kt ILD S++ Q+ N +LQS + ++ A + + 

Sbjct: 307 TG KS V AI LND AEAS Y I QS H KNQMEHTQ LT F KENRD MLKQS VD LS NKQ VAT AN S Q AS YNAQ 366 

Query 419 VLNG S D P KS RFYDAVS VASNLS PT ALFG KF NEEYNFYKQQQ- - 459 

s +++ + S N++ L G F N +YN QQ 

Sbjct: 367 FAVDSAN INQWTEGASGI LNVAGNLLTGNFGGAIX3GLASGGMKVFNANRDYNDKVVQQGF 426 

Query: 460 AEYKDLALQPPSVTESEMGNAFQIANSIN 488

 Qp + AFQ N +

Sbjct: 427 T S ENNAL KS QS N ALANMKS KI ALDQ S I RA YNATMAD LQNQ P I S VQQ IGND LAFQ SG NRLT 486 

Query 489 GLTMKISVPSPKEITFLQKYYMLFGFEvTTOY-NSFIEPINSMTVCNYLKCTGTY- -TIRD 545 

+ K+S+ + > +Y +G VN + N ♦ + S NY*K T + R 
Sbjct: 487 DVYWKVS LAQKE IMGRANEYI KCYGVLVNWFTNDALS VMRSRKRFNY I KMINVNLGTLR- 545 

Query: 546 I D PMLMEQLKAI LESGVRFWH 566 

+ M ++AI +SGVR W+ 
Sbjct: 546 ANQSHMNAIQAI FQSGVRIWN 566 
Query= pt|110875 44AHJDORF005 Phage 44AHJD ORF |12643-13890|-1 1
(415 letters) 

>gi|3845203 (AE001399) GAF domain protein (cyclic nt signal
transduct.) [Plasmodium falciparum]
Length = 1245

Score = 52.3 bits (123), Expect = 6e-06
Identities = 59/246 (23%), Positives = 105/246 (41%), Gaps = 27/246 (10%)

Query: 174 ESIDRNHG^YIGFPKMFLLGNAVNFSSPI^ 233

+ S D N+ N + ♦ N + V FS> N IY++L N +YK + E + 

Sbjct: 854 DSSDNNNNNNNNNNNNNNYNNW NEKIYDML- - NRDNIYKKVKKEIF 904 

Query: 234 RNDYVNEKRNTRAFNS^OAWTTGEFEFNEYNIADDNLRNHINQNGDFFY 291
 ^ ^ ^ + +N + M ■ + N N ++N+ N+ N NGD Y KY

Sbjct: 905 EGDSIIKTMENKPNLTNKNYMNNDNIDNNN 964 

Query: 292 KVMYNVTTFMTNI IWPYTKQYEFCTKIR*DID^^^VTYLRDDMFYKE^^MERYYYNPSN^ 350
 + + ♦ KE K+ I + L +F+K KM + ♦ L+

Sbjct: 965 TSIFNKDLYVKHFVDIIMNKSLEEIIKMNVYISERINSL LFHKGNM LNDVTKLY 1018 

Query: 351 FDNA YS KNYWDNDRYLYLDMNKI I KFH I KNEMKKNMS EFERKEKI YEDN YI ENTK 406

NAY + N KIP + E K +M F+ +KIY+N * NK 

Sbjct: 1019 MSNAYGEKCFFFN FPQIKEIIFVNEYEKKMDMKYFKMLKKIYKYNLNKIFSNNYK 1073 

Query: 407 KYLMKQ 412 
+ ++K+ 

Sbjct: 1074 FFIIKK 1079 

>gi|3758843|emb|CAB11128.1| (Z98551) predicted using hexExon;

MAL3P6.23 (PFC0820w) Hypothetical protein, len: 4982 aa
[Plasmodium falciparum]
Length = 4981

Score = 49.2 bits (115), Expect = 5e-05

Identities = 67/287 (23%), Positives = 110/287 (37%), Gaps = 60/287 (20%)

Query: 127 ITDLNSATDLKYHSNFLKHYPI 1 1 YOEFLALEDOYLIDEWDKLKT ■ I YES IDRNHGN 182

T n+N + D+ + +++ I YD +++DK++ IY + ID++ N 

Sbjct: 3619 IMDINKSKDI SKNMEIVQS- - - IEYD — -NKYDKIRNDMDAIYMAIDKDMDN 3664 

Query: 183 VDYIGFPKMFLLGNAVNFSSPILSNIiNIYNL -LQKHKMNTSRLYKNIFLEMRRNDYV 238
 + ^ ^ ^ g +(j ^ '++ K N R Y N F +D

Sbjct: 3665 IGI INCMRYFNLYKNYNNLSNEQnOlE-YNL^ELYMEDIKRhMKR-YDtTOFNINHYDDNN 3722 

Query: 239 KEKSUTrRAFNSNDDAMTTGBFEFMEYNLADDMIJRNHINQNGDFFYIKTDDKYIKVMYMVT 298

^' H N N + N + + N N ++N N* M NG F+ D 
Sbjct: 3723 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNW 37 ' s - 

Query: 299 TFMTNIIWPYTKQYEFCTKIRDIDMHVTYLHDDMFYKENMERYYYNPSNLHFDNAYSKN 358

K FCTK + + F +N+E N N N Y* N 

Sbjct: 3772 KDLFFCTK KNIFPCKNIETVCKNEYNKKIYNNYTCN 3807 

Query: 359 YWDNDRYLYLDMNKI I KFH I KNEMKKNMS EFERKEK-I YEDNYI EN 404
 V4 .„ + ++ ik ♦ + N E+ ♦ EK +Y + EN

Sbjct: 3808 ISVNNTUJCLNIIKELIKLNNNIOCKILNYYEYHKVEKLLYYRHSFEN 3854 

Score = 35.6 bits (80), Expect = 0.70

Identities = 62/290 (21%), Positives = 121/290 (41%), Gaps = 65/290 (22%)

Query: 2 VKQNRLXlMVRDYQNAVN- -HVPJGCIPDKYNQIELVDELMNDDIDYYISISNP^DGKSFNY 59

' +K+N +♦ +N +N +V++ DK N I D++I + SN + +SF 

Sbjct: 4445 IKRNNINKSNIKRNNINKSNVKRSNTDKSNVIS DFHIT-SNNNITRSFT- 4492 

Query: 60 vSFFIYLAIKLDIKfTLLSRHYTLRDAYRDFIEEIIDENPLFKSKRVTFRSARDYlAIIY 119

A D F LS TL +Y +F ♦ + ~ 
Sbjct: 4493 -- ATLTDSIFNTLSE--TLNYSYDNFFSNMDN - I" 4523 

Query 120 QDKEIGVITDUNSATDLKYHSNFLKHYPIIIYDEFL ALEDDYIilDEWDKIJCTIYE 174 

Query. uuiu^ ^ + +e++ + +d + de ++t+ e 

Sbjct: 4524 KKNEINNITDVT5YGNKKEYHENYLKVKQNKVNEEYIEETFKSDKDCSIKDEACTIRTLSE 4583
Query: 175 S - - IDRNHGNVDYIGFPKMFLLGNAVNFSSPILSNLNIYNLLQKHKMN- -TSRLYKNIFL 230

SIN N + D + + ♦ S P N++ N ++K+ +N R+ KN 

Sbjct: 4584 SCNISENISNID f^DEDHISFPNGRNVHDNNYMKKNHVNYDKMRVGKNKIP 4634 

Query: 231 EMRRNDYVNEKRNTRAFNSNDDAMTTGEFEFNEYNl^ 280 

D + ♦ +D M+4- ++ E ++ + L + NG+ 
Sbjct: 4635 SFTHFDKILDEKKKK SDiCDMSSSKWLEREEHIKEIKLEKNEYMNGN 4680 

Score = 34.0 bits (76), Expect = 2.0

Identities = 47/211 (22%), Positives = 84/211 (39%), Gaps = 32/211 (15%)

Query: 210 IYNLLQKHKMOTSRLYKNIFLEMRiW^ 269 

I++LLQK LY+N+ + R + N+ T E ++ + + + 

Sbjct: 918 IFSLLQKDSSPLLVLYENVHI REGEKYGRNE - - ATDNEVDYKKGD 1 1 KH 964 

Query: 270 NLRNHINQNGDFFYlKTD DKYIKVTfYNVTTFMTNII WPYTKQYEFCTKIRDIDNHV 326 

N+ N + 0 + 0+ K MY + V E K D+ N+ 

Sbjct: 965 NVTNEHGNHSDSYPYGNSLNLDRKPKNMYE - DI YKEKGFVKSDCSNI E I - - KKNDMINND 1021 



Query: 327 TYLRDDMFYKENMERYYYNPSNIJiFDNAYSKNYVVDNDRYLYLDMNKII KFHIKNE 382 

y +++ FY+++ Y+ + YV++ +YL +N ++ F +KN+ 

Sbjct: 1022 VYKKNE-FYEDSRINMIYDEDEIKTWFLIPHKYVIN IIYLFLNILLTDESNFKLKNK 1077 

Query: 383 MKKNMSEFERKEKIYEDN YIENTKKY 408 

E K IYEDN ++N KKY 

Sbjct: 1078 KYGYFVNEETKGTIYEDNNGLQEILKNGKKY 1108 

Score = 33.6 bits (75), Expect = 2.7

Identities = 42/198 (21%), Positives = 77/198 (38%), Gaps = 42/198 (21%)

Query: 222 SRLYKNIFLEMR RNDYVNEKRNTRAF NSNDDAMTTGEFEFNEYNLA 267 

S LY I + + + 4N + K+NT + N+++D TT E + + 

Sbjct: 411 SVLYSIIYWNKKYKKKNFIITNKKNTNW^ 470 

Query: 268 DDNLRNHINQNGDFFYIKTDDKYIKVKYNVTTFM™ 327 

+ ++R +N D +DDK ++Y N YTK E 
Sbjct: 471 MNDMRYSVNNYA^EKVYTiSDDKSDHLIYKHVHD^ 517 

Query: 328 YLRDDMFYKENMERYYYNPSNLHFDNAYS KNYVVDNDRYLYLDMNKI I KFHI KNEMKKNM 387 

+++ YK N+ + N K LD+ K I H+KN+ ♦ N 

Sbjct: 518 - -NENI IYKSNIVDKKTCDISSEMVNGKDK LDVEKYIGSHVKND-ENNK 563 

Query: 388 SEFERK-EKIYEDNYIEN 404 

+ ++ k ♦ + + YI+N 
Sbjct: 564 E KLKKK I DNVNKKE Y I DN 581 

>gi | 3845297 (AE001421) hypothetical protein [Plasmodium falciparum] 
Length = 2380 

Score = 48.0 bits (112), Expect = 1e-04

Identities = 87/390 (22%), Positives = 160/390 (40%), Gaps = 65/390 (16%)

Query: 20 V^KKIPDK^QIELVDEL^ftTODIDYYISISNRSDGKSFNYVSFF IYLAIKLDIKF 74 

♦ +K ++ ♦ +N D + ♦+ R K+ NY++ +YL I DI 

Sbjct: 1049 LQRKNMNKCSKNRNRNRYINIQ^SNIHIiMNLIRIK^ 1108

Query: 75 TLLSRKYTLRDAYR DFIEEI IDEN- PLFKSKRVTFRSARDYLAI IYQDKEIGVI 127 

+Y +++ Y + + + EN ♦ ♦ ♦+ + Y +K+ 

Sbjct: 1109 QFWKHNYlJVQNFYNFSITLINIMSKYYSENFYAYm,EKIvTKFLI^KNFEYIEKQYSSK 1168 

Query: 128 TDLNSATDLKYHSNFLKHYPIIIYDEFLA LEDD YLI DEWDKLKT I YES I DRNHGNV 183 

D+N D+ ++ +K+ II EFL L+ D I ♦ KLKT + + 

Sbjct: 1169 EDMNEL-DILVNTYUMKYDKI I EFLKNNGYLKIDRYI YFYPKLKT ----DI 1214 

Query: 184 DYIGFPKMFLLGNAVNFSSPILSNLNIYNLLQKHKMNTSRLY KNIF- -LEMRRN 235 

F ++FL N + h NI ++♦ K ♦ Y K IF + M+ + 

Sbjct: 1215 ILFFFKEIFt^NILKIDRKFLKK-NITIMIEVXKEIFFKE^ 1273 

Query: 236 DYVNEKR - -^RAFNSNDDAKTTGEFEFNEYNLADDNLRNHINQNGDFFYIKTD 287

D+V K N+ FN* D + N YN D+ N+ N N +Y K 
Sbjct: 1274 DHVMNKNYYNNQYVNNStWFOT^ 1332

Query: 288 DKYIKVMYNVTTFMTNIIV VPYTKQYEFCTKIRDIDNHVTYLRDDMFYKEN ME 340 

+K K+MY + V ■ K + K I + Y+++ N + 

Sbjct: 1333 NKN-KIMYEKERKSSSLFISNNVQDVKPIKHYLKYSSIYKNFIYIISEIKNFNNKITKIN 1391 

Query: 341 R Y - YYN P SN LH FDNAYS KNYWDNDR YL Y L -369 

RY YYN NL+ D+ ND YL+L 

Sbjct: 13 92 RYNYYNYMNLNIDDL NDAYLFL 1413 

Score = 32.5 bits (72), Expect = 6.0

Identities = 46/183 (25%), Positives = 73/183 (39%), Gaps = 26/183 (14%)

Query: 225 YKNIFLEHRRNDYVNEKJINTRAFNSNDD^ 284 

+ KNI ++ + +N ♦ NSN + + N N+ +N N IN + I 

Sbjct: 27 HKNINKNIKNKKFINIDNSNNCNNSNSNNSNSNNNNNNNNNIV^ 85

Query: 285 KTDDKYIKVHYNVTTFMTNIIWPYTKQYEFCTKIRDIDNHW 344 

+D IK V NI Y +*• > D+ N+ + + KE ER 

Sbjct: 86 LNEDDDIKNKELVDESFVNIFF- -YENYFKNLFNLNDVSNNKVI- -NIIEQKEGDER 138

Query: 345 NPSNUiFDNAYSKNYVVTOIDRYLYLDMNKI^^ 404 

N N N +KN V DN +NK IKN +N++E Y N++ + 

Sbjct: 139 NADN NLKNKNI VRDN INK IKN- -TRNVNEILIYNNKYI INFLND 180 

Query: 405 TKK 407 
T K 

Sbjct: 181 TTK 183 

>gi|4493936|emb|CAB38972.1| (AL034556) predicted using hexExon;

MAL3P5.6 (PFC0600w), Hypothetical protein, len: 250 aa
[Plasmodium falciparum]
Length = 249

Score = 47.3 bits (110), Expect = 2e-04 

Identities = 53/215 (24%), Positives = 87/215 (39%), Gaps = 30/215 (13%)

Query: 209 NIYNLI^KHKMNTSRLYKNlFLEMRRNDYWEKRNTRAra^ 266 

NIYN L+ + YKN N ++ +N N+N EFE N YN 

Sbjct: 13 NIYNKLEEK YKNFLKLKNMNSHMGASQNMNV-NNKYTMNELEEFEKINNNYNN 64 

Query: 267 ADDNLRNHINQNGDFFYIKTD DKYIKVMYNVTTFMTNI IVVPYTKQYEFCTKIRD 321 

♦ +N+ N+IN D+ IK +K ++ YN +1 T +++ 

Sbjct: 65 NNNN I NNN I NNYYD YMN I KVSQS VQHNKRLQD FYNNKNS FQHY I KKLKTCRFDADD I RNL 124 

Query: 322 I DNHVTYLRDDMFYK ENMERYYYNPSNLH FDNAYS KNYWDNDRYLYLDMNKI I K 376 

++ +YRD+ K EN+ N+ N+SNY DN+ LY +N++ K 

Sbjct: 125 LEKRLAYERDNTLIKNIQEEENKKGIGINGNFGSESNSSSSNY- -DNNYLLYRKINRLNK 182 

Query: 377 FHIKNEMKKNMSEFERKEKIYEDNYIENTKKYLMK 411 

+ ++ KI KKY++K 
Sbjct: 183 TNTNKSKNRSRKRKRINSKI DKKYIIK 209 

>gi|3845165 (AE001390) hypothetical protein [Plasmodium falciparum]
Length = 1247 

Score = 45.7 bits (106), Expect = 6e-04 

Identities = 52/239 (21%), Positives = 94/239 (38%), Gaps = 38/239 (15%)

Query 206 SNLNIYNLLQKHKMNTSRLYKNIFLEMRRNDYVIJEKRN^ 265 

+N N +N ++K K R I >N ♦ +N ++N+D E N N 

Sbjct: 474 NNTNKWNEIKKRKX^FKREKNKIIWSFQ 533 

Query: 266 LADDNLRNHINQNGDFFYI-KTDDKYIK VMYNVTTFMTNI I WPYTKQYEFCTKIR 320 

D+N N+ + N D I D+ Y +YN T ++ YTK + + + 

Sbjct: 534 NNDNNNENNNDINNDINNIHNNDNNYYN^ 592 

Query: 321 DIDNHVTYLRDDMFYKENME ■- -RYYYN PSNLHFDNAYS 356 

+ + ++ + FY++N + ++YYN + N 

Sbjct: 593 DMLPS I KFETFYEKNTDHKNFNENYKFYYNTDDDTDI INAI KKKNVKNKKXNGNIVI 649 

Query: 357 KNYvVDNDRYLYLDMNKIIKFHIKNEMKKNMSEFER KEKIYEDNYIENTKKYLMK 411 

KNY+ N+ Y YL+ N+ + I + K +E K+ !♦ -f+Y E K K 
Sbjct: 650 KNYINHNE-YSYLEYNENKNYEINKKEKLLTEOT^^ 707



Score = 41.0 bits (94), Expect = 0.016

Identities = 58/245 (23%), Positives = 96/245 (38%), Gaps = 43/245 (17%)

Query: 207 NLNIYNLLQKHKMNTSRLYKNIFLEMRR^ 266 

N+N+YN ♦ K K YF+D+ + ♦ N D E YN 

Sbjct: 564 NINLYNEffTKKKCMLDNSYTKYFFYIFTLDMLPSIKFETFYEKNTDHKNFNENYKFYYNT 623 

Query: 267 ADD NLRNHINQNGDFF- - -YIKTDDKYIKVMYNVT-TFMTNI IWPYTKQ 312 

DD N++N +NG+ YI ++ Y + YH. + N T+ 

Sbjct: 624 DDDTDIINAIKKKNVKNK-KKNGNIVIKNYIN^^ 

Query: 313 YEFCTKIRDIDNHVTYLRDDMFYKENMERYYYNPSNLHFDNAYSK NYV--VD 362 

YE+ I+D ++ Y D + + YN +N +N Y K +Y+ VD 

Sbjct: 682 YEYDWYIKDNIHYNDYSEGDGKQTKKASSFLYNNNN NNKYKKEDNKTQI ISYMDHVD 738 

Query: 363 NDR -YLYLDMNKIIKFHIK-NEM -KKNMSEFERKEKIYEDNYIENTKKY 408 

K+ Y + +++ F +K N+M K+ F +E I + +EN K+ 

Sbjct: 739 NENGVKGLKKRNLFYNNSDQLYNFDVKDNDMIKYEKRQSKNFVEEEFING>^ 798 

Query: 409 LMKQY 413

L K Y 
Sbjct: 799 LKKHY 803 



Query= pt|110877 44AHJDORF007 Phage 44AHJD ORF |2044-3027|1 1
(327 letters) 

>gi|1181960|emb|CAA87731.1| (Z47794) connector protein
[Bacteriophage CP-1] 
Length = 337 

Score = 45.7 bits (106), Expect = 5e-04

Identities = 44/184 (23%), Positives = 84/184 (44%), Gaps = 13/184 (7%)

Query: 127 QIHKLYDNCMSGNFVVMQNKPIQYNSDIEIIEHYTDELAEVALSRFSLIMQAKFSK- -IF 184 

+ +HK + + +V+ N Y I +E ♦ ++LA++ L+ L A+ ♦ IF 
Sbjct: 125 ELHKDNPDKIKRPCIVIPNNNF-YEPYIGYXELFCEKLADIELT-IQLNRNAQITPYFIF 182 

Query: 185 KSEINDESINQLVSEIYNGAPFVXMSPMFNAD DD 1 1 DLTSNSVI PALTEMKR 236 

N S+ + + + I MPV+++D DI+ L++ 

Sbjct: 183 ADNTNVLSMKNIFNKIANFEPVVYLNKQKDQDGQ 242 

Query: 237 EYQNKISELSNYLGINSLAVDKESGVSDEEAKSNRGFTTSNSNIYLKGREP- ITFLSKRY 295 

E +++L ++GIN+ DK+ + EA SN G ++N + K R + ++K Y 
Sbjct: 243 EKIAVTONQLLTFIGINNNPSDKKERLWSEAISNN^ 302 

Query: 296 GLDI 299 
GL+I 

Sbjct: 303 GLEI 306 

>gi|1429239|emb|CAA67658| (X99260) upper collar protein
[Bacteriophage B103] 
Length = 308 

Score = 44.9 bits (104), Expect = 8e-04 

Identities = 40/159 (25%), Positives = 73/159 (45%), Gaps = 11/159 (6%)

Query: 150 YNSDIEI IEHYTDELAEVA-LSRFSLIMQAKFSKIFKSEINDESINQLVSEIYNG 203 

YN+D++ +E + +LAE+ + + Q I ++ N S+ + ++ 

Sbjct: 121 YNNDLKCSTLPALEMFAQDLAELKEIIAV1IQNAQKTPVLIAANDNNQLSLKNIYNQYEGN 180 

Query: 204 APFVKMSPMFNADD-DIIDLTSNSVIPALTEMKREYQNKISELSNYLGINSLAVDKESGV 262 

AP ♦ + + D+ + + V> L K N E+ YLGI ♦ ++K+ + 
Sbjct: 181 APVIFvT*ESLDLDNLKVFKTDAPYvVDKLNAQKNAVWN EVMTYLGIKNANLEKKERM 237 

Query: 263 SDEEAKSNRGFTTSNSNIYLKGR-EPITFLSKRYGLDIK 300 

E SN S+ NIYLK R E +S+ YGL++K 

Sbjct: 238 VTSEVDSNDEQIESSGNIYLKARQEACNKISELYGLNLK 276 

>gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR
PROTEIN) (LATE PROTEIN GP10) >gi|75851|pir||WMBP10 gene






10 protein - phage PZA >gi|216059 (M11813) upper collar
protein [Bacteriophage PZA]
Length = 309

Score = 43.8 bits (101), Expect = 0.002 

Identities = 38/160 (23%), Positives = 75/160 (46%), Gaps = 13/160 (8%) 



Query: 150 YNSDIEI IEHYTDELAEVALSRFSLIMQAKFSKIF--KSEINDESINQLVSEIYN 202 

YN+D+ +E + ELAE+ S+ A+ + + ++ N S+ Q+ ++ 

Sbjct: 122 YNNDMSFPTTPTLELFAAEIJ\£LK- EI IS V1IQNAQKTPVLIRAOTNNQLSLKQVYNQYEG 180 

Query 203 GAPFVKMSPMFNADD-DIIDLTSNSVIPALTEMKREYQNKISELSNYLGINSLAVDKESG 261 

A p + ++D ++ + V+ L K . N E+- +LGI + ++K+ 
Sbjct: 181 NAPVIFAHEALDSDSIEVFKTDAPYWDKLNAQKNAVWN EMMTFLGIKNANLEKKER 237 

Query: 262 VSDEEAKSNRGFTTSNSNIYLKGR- EPITFLSKRYGLDIK 300 

+ +E SN S+ ++LK R E + + + YGLD+K 

Sbjct: 238 MVTDEVSSNDEQIESSGTVFLKSREEACEKINELYGLDVK 277 

>gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR
PROTEIN) (LATE PROTEIN GP10) >gi|75852|pir||WMBPC9 gene
10 protein - phage phi-29 >gi|215328 (M14782) upper
collar protein [Bacteriophage phi-29] >gi|215340
(M12456) p10 connector protein [Bacteriophage phi-29]
>gi|224161|prf||1011232A protein p10, connector
[Bacteriophage phi-29] >gi|225365|prf||1301270E gene 10
[Bacteriophage phi-29]
Length = 309

Score = 41.4 bits (95), Expect = 0.009
Identities = 37/160 (23%), Positives = 75/160 (46%), Gaps = 13/160 (8%) 

Query: 150 YNSDIEI IEHYTDELAEVALSRFSLIMQAKFSKIF- -KSEINDESINQLVSEIYN 202 

YN+D+ +E + ELAE + S+ A+ + + ++ N S + Q+ ++ 

Sbjct: 122 YNNDMA F PTT PT LE L F AAE LAE L K - E 1 1 S VNQNAQ KT P VL I RAND NNQ LS LKQ VYNQ YEG 180 

Query: 203 GAPFVKMSPMFNADD-DIIDLTSNSVIPALTEMKREYQNKISELSNYLGINSLAVDKESG 261
" W ' A p + ++D ++ + V+ L K N E+ +LGI ♦ ++K+ 
Sbjct: 181 NAPVIFAHEALDSDSIEVFKTDAPYVVDKLNAQKNAVWN EMMTFLGIKNANLEKKER 237 

Query: 262 VSDEEAKSNRGFTTSNSNIYLKGR -EPITFLSKRYGLDIK 300

+ +E SN S+ ++LK R E ' + + + YGL++K 

Sbjct: 238 MVTDEVSSNDEQIESSGTVFLKSREEACEKINELYGLNVK 277 

Query= pt|110878 44AHJDORF008 Phage 44AHJD ORF |3020-3775|2 1
(251 letters) 

>gi|4982468|gb|AAD30963.2| (AF118151) SNF1/AMP-activated kinase
[Dictyostelium discoideum]
Length = 718

Score = 52.3 bits (123), Expect = 3e-06

Identities = 28/118 (23%), Positives = 56/118 (46%), Gaps = 5/118 (4%) 

Query 121 YLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGrfTANRNAYV SLPQSEVNIDVDN 176 

+ + GF N++SN+ +NN + Nf T N N ♦ ++ + +N + +N 
Sbjct: 382 FTTTTGFNPTNSNSISNNNNNNNNNNNNTTNNNNN^ 441 

Query: 177 TTLRF ADNNT I D NG KTVNKS S N E SNQN AKRNQ NQKGNAKGTQ FT KQ Y L I D - N I D KA YD 233 

I+.N N ++N +N N N N N+ T+ + I N++ +Y+ 

Sbjct: 442 NNNNINNNNIINNNNNNNNNNNNNNNNNNNNNNN^ 

Score = 37.5 bits (85), Expect = 0.094 

Identities = 17/111 (15%), Positives = 45/111 (40%)

Query: 130 HNE DTT S NTD ET S NQN AT S LD N S TGMT ANRNA YVS L PQ S E VN I DVDNTTLRF ADNNT I DN 189 

+N + +N + +N N + +N+* + P ♦ + ++♦ N+ ++ 

Sbjct: 456 nnnnnnNNNNNNNNNITNNN^^ 515

Query: 190 GKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLRKKILN 240 

N +N +N N N N N ID+++ ♦ + ♦ N 

Sbjct: 516 NNNTNNDNNNNNNNNNNNNNNNNNNNNNNNNNl^ 566 
Score = 32.8 bits (73), Expect = 2.4

Identities = 31/140 (22%), Positives = 57/140 (40%), Gaps = 14/140 (10%)

Query: 109 LNWYS S S E VE KY LQS QG FT E HN E DTT S NTD ET S NQN AT S LDN STGMT ANRN A YV S L 165 

LN Y+S+ S N VT + N++NN + +N+ N N + 

Sbjct: 494 LNNSYNSNSSGNSNGSNSN^SNNNTNNDNNNNNN^ 553 

Query : 166 PQSEVN- -IDVDNTTLRFADNNTIDNGKTVNKSS NESNQNAKRNQNQKGNAK 215 

+ +N DV+N+ + +NN D+G N ++ N N ♦ N GN 

Sbjct: 554 VNNSI^EbTOVNNSNINNNNNNNSDDGSN^ 613 


Query: 216 GTQ FTKQ YL I DN I D KA YD LR 235 

Q L++++D D++ 
Sbjct: 614 NLNNNFQ-LLNSLDLNSDIQ 632 

Score = 31.7 bits (70), Expect = 5.4 

Identities = 25/115 (21%), Positives = 48/115 (41%), Gaps = 10/115 (8%)

Query 130 HNEDTTSNTDETSNQNATSLDNST GMTAN- RNAYVSLPQSEVNIDVDNTTLRFADNN 185 

* ' ~ +N + +N + +N N +S + T N N+Y S S N + N+ +N 

Sbjct: 462 NNNNNNNNNNNNNNNNNSSISGGTEVFS 519 

Query: 186 TIDNGKTVIJKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLR^ILN 240 

DN N ++N +N N N N N + D+ +N 

Sbjct: 520 NNDN NNtmNNNNNNNNNNNNNNNNNNNNW^ 

Score = 31.7 bits (70), Expect = 5.4

Identities = 15/104 (14%), Positives = 43/104 (40%) 

Query: 110 NVVYSSSEVEKYLQSQGETEHNEDTTS^^^DETSNQNATSLDNSTG^f^ANR^^AYVSLPQSE 169 

N+ +++ + + +N + +N + +N N + +N+ + + V 

Sbjct: 434 NINNNNNNNNNNINNNNIINNNNNNNNNNNNNN^ 493 

Query: 170 VN I D VDNTT LRF ADNNT I DNG KTVNKS S N E S NQNAKRN QN Q KGN 213 

+N + + + ++ + +N N +++ +N N N N N 
Sbjct: 494 LNNSYNSNSSGNSNGSNSNNNSNNNTNNDNN^ 537

Score = 30.9 bits (68), Expect = 9.2

Identities = 16/84 (19%), Positives = 34/84 (40%)

Query 130 HNEDTTSNTDETSNQNATSUWSTGKTAN^ 189 

+ N + +N + +N N + +N+ ♦ S+ + N N++ +N+ +N 

sbjct: 455 NNNNNNNNNNNNNNNNNNNNNNNNSS ISGGTEVFSISPNLNNSYNSNSSGNSNGSNSNNN 514 

Query: 190 GKTVNKSSNESNQNAKRNQNQKGN 213 

+ N +N N N N N 
Sbjct: 515 SNNNTNNDNNNNNNNNNNNNNNNN 538 

>gi|1730077|sp|P18160|KYK1_DICDI NON-RECEPTOR TYROSINE KINASE SPORE
LYSIS A (TYROSINE-PROTEIN KINASE 1) >gi|974334 (U32174)
non-receptor tyrosine kinase [Dictyostelium discoideum]
Length = 1584

Score = 46.5 bits (108), Expect = 2e-04

Identities = 29/106 (27%), Positives = 48/106 (44%), Gaps = 4/106 (3%)

Query 130 HNEDTTSNTDETSNQNATS LDNSTGMTANRNAYVSL PQSEVN ID VDNTTLRFADN - N 185 

+NED +SN ■*• +N N ♦ +N+ N N + + N > ++NTT N N 

Sbjct: 442 NNEDISSNNNNNNNNNNNNNNNNNNNN^ 501 

Query: 186 TIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKA 231 

+N N +SN +N N N N N TK+ I ♦ D++ 

Sbjct: 502 NNNNNNNSNSNSNSNNNNINNNNNNNNNNNNI YLTKKPS IGSTDES 547 

Score = 34.0 bits (76), Expect = 1.1

Identities = 20/117 (17%), Positives = 46/117 (39%) 

Query 87 tmQTVEAFGMQVITVCITOEDYIJJVVYSSSEV^ 146 
N G IT T ++++++ + +N + +N + +N N 
Sbjct: 415 NNNNNNIIGNGKITTTTTTSTSPSSINNNE 474

Query: 147 TSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFADNNTIDNGKTVNKSSNESNQN 203

+ ++++ X N N ♦ + N + +N N+ +N N + +N +N N 

Sbjct: 475 NNNNSNSSNTNNNNINNTTN^ 531

Score = 33.2 bits (74), Expect = 1.8

Identities = 18/88 (20%), Positives = 35/88 (39%)

Query: 130 HNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFADNNTIDN 189

+N + ++N + +N N T T + S+ +E +N +NN +N 

Sbjct: 405 NNNNNSNNNN>rtttlNNNII^^ 464 

Query: 190 GKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

N + +N +N N+ + NT 
Sbjct: 465 NNNNNNNNNNNNNNSNS SNTNNNN I NNT 492 

Score = 32.5 bits (72), Expect = 3.1

Identities = 18/94 (19%), Positives = 37/94 (39%)

Query: 120 KVI^SQGFTEHNEDTTSbTTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTL 179 

K + S N + +N++. +N N ++ + *T S N D+ + 

Sbjct: 392 KNVNSTS I LVPNGNNNNNSNNNNNNNNNN I IGNGKI TTTTTTSTS PSS INNNED I S SNNN 451 

Query: 180 RFADNNTIDNGKTVNKSSNESNQNAKRNQNQKGN 213 

+NN +N N + +N +N N + ♦ N 
Sbjct: 452 NNNNNNNNNNNfcnTNNNNNN 485 

Score = 32.5 bits (72), Expect = 3.1

Identities = 24/110 (21%), Positives = 44/110 (39%), Gaps = 10/110 (9%)

Query: 138 TDETSNQNATSU)NSTG^^^A^^WAWSLPQSEWIDVDNTTLRFADKNTIDNGK 191 

X X++ + +S++N+ +++N N + + N ♦ +N +NN N 

Sbjct: 429 XTTTTSTS PSS INNNED IS SNNNNNNNNNNNNNN^ 488 

Query: 192 TVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLRKK 237 

X N +SN +N N N N N+ +N > L KK 

Sbjct: 489 INNTTNNNNSNSNNNNNNNNSNSNSNSNIJNNINNNNNNNNNN^ 538 

>gi|3758855|emb|CAB11140.1| (Z98551) predicted using hexExon;

MAL3P6.11 (PFC0760C), Hypothetical protein, len: 3395 aa
[Plasmodium falciparum]
Length = 3394

Score = 46.5 bits (108), Expect = 2e-04

Identities = 52/202 (25%), Positives = 96/202 (46%), Gaps = 32/202 (15%)

Query: 21 FNEFVNDNKLTFYDDEFQFMQKMLKFD-KDVTA^ 77 

F ♦+ K T D+ M+K K D DV + NEK+ + L + + KK 

Sbjct: 665 FEKYCSNIKNTLIRDD MKKFRKPDISDVHILHNEKIYLEKLLNEKLNYIKDIEKKLD 721 

Query: 78 TIHFLDREINRQTVEAFGMQV ITVCITHEDYLNWYSSSEVEKYLQSQGFTEHNE 132

+H + IN+ + + +QV , IV + DY' + S + + K ♦ +N 
Sbjct: 722 ELHGV INKNKEDIYILQV^KQTLIKVISSVYDYTKME-SENHIFKMNTIWKMLNNV 777 

Query: 133 DTTSNTDETSNQNATS LDNSTG^1TANRNAYVS LPQS EVNI DVDNTTLRFADNNTIDNGKT 192 

+SN D +NQN ++4-N+ + N+N N +++N + N +N 

Sbjct: 778 HMSSNKDY- NNQNNQN I ENNQN I ENNQN NQNIEN -NQNIENNQNN 820 

Query: 193 VNKSSNESNQNAKRNQNQKGNA 214 

N +N++NQN + NQN + NA 
Sbjct: 821 QNNQNNQNNQNNQNNQNNQNNA 842 

Score = 33.6 bits (75), Expect = 1.4 

Identities = 46/221 (20%), Positives = 89/221 (39%), Gaps = 37/221 (16%)

Query 10 DFIKSELIKKGFNEFVTTONKLTFYDDEFQFMQK^KTO 69 

D +K E K N ♦ +L Y + + M+K K + V K SL 

Sbjct: 367 DS LKIEYNKSKTN I QQLNEQLWYKNF I KEMEKKYK QLWKNNSLFSITH 416 

Query: 70 DLLFKKSFTIHFLDREINRQTVEAFGMQVITVCITH EDYLNWYSSSEVEKYLQSQG 126 
D+K+I+R+.+ + + ♦ + I H +D+L+V+Y + + L +
Sbjct: 417 DFINLKNSNIIIIRRTSDMKQI FKMYNLDIEHFNEQDHLSVIY IYEILYNTN 468 

Query: 127 FTEHNEDTTSNTDETSNQNATS LDNSTGrfTANRNAYVS LPQSEVNIDVDNTTLRFADNNT 186 

+N D ' +N D +N N + +N+ N N N + +N + 
Sbjct: 469 -DmWNU^W^NNNNNNNNND NNNYNNIMM M 512 

Query: 187 IDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDN 227 

I+N + N +++ N+NN+N +++Y I+N 
Sbjct: 513 I ENMNSGNHPNSNNLHNYRHNTNDENNLSSLKTS FRYKINN 553 



Score = 32.8 bits (73), Expect = 2.4

Identities = 28/122 (22%), Positives = 53/122 (42%), Gaps = 2/122 (1%)

Query: 119 EKYLQSQGFTEHNEDTTSbrTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID-VDNT 177 

E Y S + +++ N + +N + + DN+ N N ++ +N D ++N 

Sbjct: 2838 ENYPVSTHYDNNDDINKBNINNDNNNDNI^ 2897 

Query: 178 TLRFAD^IDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLRKK 237 

+N+ +NG SSN ++ N N N K N +G + + + + YD K 

Sbjct: 2898 NNNDNNNDNSNNGFVCELSSNINDFNNILNW-KDNFQGIN^ 2956 

Query: 238 IL 239 
' 1 + 

Sbjct: 2957 IV 2958 
Score = 32.5 bits (72), Expect = 3.1

Identities = 46/249 (18%), Positives = 101/249 (40%), Gaps = 31/249 (12%) 

Query: 9 YDFIK^ELIKKGFNEFVNDNKLTFYDDEFQFMQKMLKFDKDVTJVIVNEKVFKGFSLKI)EL 68 

Y+++K >+ N N NK E Q++ K+ + > + +E K L+ + 

Sbjct: 2150 YNYVK VQNATNREDNKNK - -ERNLSQEIYKYINENIDLTSELEKKNDMLENYK 2200 

Query: 69 SDL lFKKSFTIHFUDREI^QTVEAFGMQ 122

++ L ++K + I L + M+ + + N + E+ + L 

Sbjct: 2201 NELKEKNEEI YKLNNDIDMLSNNCKKLKES IMMMEKYKI IMN NNIQEKDEIIENL 2255 

Query: 123 QSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTAN RNAYVSLPQSE VNIDV 174

+++ + +D +N + ++S M+ + N + +L +S N+D+ 

Sbjct: 2256 KNK-YNNKLDDLINNYSWDKSIVSCFEDSNIMSPSCNDILNVFNNLSKSNKKVC^^ 2314 

Query 175 DNTTLRFADNKTIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDL 234 

U + ++I+N +N +N +N N N N N K YL++N+ D 

Sbjct: 2315 CNENMDSI--SSINNVNNINNVNNIt^ 2372 

Query: 235 RKKILNEFD 243 
1+ +F+ 

Sbjct: 2373 DNIIIIKFN 2381 
Score = 32.1 bits (71), Expect = 4.1 

Identities = 20/103 (19%), Positives = 48/103 (46%), Gaps = 2/103 (1%) 

Query 115 S SEVEKYLQSQGFTEHNEDTTSNTDETSNQN - - ATS LDNSTGMTANRNAYVS LPQS EVNI 172 

+++ £KY EH + N D +N+N L +> ++ ♦ N S ++E+ 

Sbjct: 3264 NNDEEKYSCHDDKNEHTNNDLLNIDHDNNKNNI 3323 

Query: 173 DVDtrTTLRFADNNTIDNGKTVNKSSNESNQNAKRNQNQKGNAK 215 

+ + D N ++ N ++E+++N ♦ ++N ♦ + K 
Sbjct: 3324 LISIDSSNENDENDENDENDENDENDENDENDENDENDENTDEK 3366 

Score = 30.9 bits (68), Expect = 9.2
Identities = 27/118 (22%), Positives = 53/118 (44%), Gaps = 15/118 (12%) 

Query 104 THEDYLNWYSSSEV EKYLQSQGFTEHNEDTTSNTOETSNQNATSLDNSTGMTANR 159 

T+ D LN+ + +++ E Y HN+D + ♦ +E QN S+D+S N 

Sbjct: 3280 TNITOLLNIDHDNNKNNITDELYSTYNVSVSHNKBPSNKE 3337 

Query: 160 NA YVS LPQS E VN I D VDNTT LRF ADNNT I DNGKTVNKS S NES NQNAKRNQNQ KGNAKGT 217

+++ N + D D N + + N +E+*+N + ++N N +GT 

Sbjct: 3338 EN D END END END EN DENDENDENDENDEKDENDENDENDENFDNNNEGT 3386
>gi|585795|sp|P21538|REB1_YEAST DNA-BINDING PROTEIN REB1 (QBP)
>gi| 626X39 |pir||S45907 DNA-binding protein REB1 - yeast
(Saccharomyces cerevisiae) >gi|535280|emb|CAA84992|
(Z35918) ORF YBR049C [Saccharomyces cerevisiae]
>gi|559944|emb|CAA86391| (Z46260) REB1 DNA-binding
protein [Saccharomyces cerevisiae]
Length = 810

Score = 45.7 bits (106), Expect = 3e-04

Identities = 34/158 (21%), Positives = 72/158 (45%), Gaps = 14/158 (8%)

Query: 83 DREINRQTVEAFGMQVITVCITHEDYLNVVYSSSEVEKYLQSQGFTEHNEDTTSNTDETS 142 

D+ N+++VE ♦+ + V + H+++ +++ K+ + Q E + D N ++ S 

Sbjct: 7 DKNANQESVEEAVLKYVGVGLDHQNHDPQLHTKDL 66 

Query: 143 NQNAT S LD NSTGMT ANRNAYVS L PQS EVN I D VDNTT LRF ADNNT ID NGKTVNKS SNE 199 

N+N + D+S ++A L +E + +VD+ N +D N+ +E 

Sbjct: 67 NRNEDNNDDSENISA - -LNANESSSNVDHANSNEQHNAVMDWYLRQTAHNQQDDE 119 

Query: 200 SNQNAKRNQNQ KGNAKGTQFTKQYLI DN I DKAYDLRKK 237

++N N GN F++ ++ +D D KK 

Sbjct: 120 DDEN- -NNNTDNGNDSNNHFSQSDIV- -VDDDDDKNKK 153 

>gi|172372 (M58728) DNA-binding protein [Saccharomyces cerevisiae]
Length = 809

Score = 45.7 bits (106), Expect = 3e-04

Identities = 34/158 (21%), Positives = 72/158 (45%), Gaps = 14/158 (8%)

Query: 83 DREINRQTVEAFGMQVITVCITHEDYLNV^ 142 

D+ N+++VE ++ + V +. H+++ +++ K+ + Q E + D N ++ S 

Sbjct: 7 D KNANQ E S VEEAVLKYVGVGLDHQNHD PQLHTKDLENKHS KKQNI VES SND VDVNNNDDS 66 

Query: 143 NQNATSLDNSTGMTANRNAYVSLPQSEVNIDVlDhnTLRFADNNTID NGKTVNKS SNE 199 

N+N + D+S ++A L +E + +VD+ N +D N+ +E 

Sbjct: 67 NRNEDNNDDSENISA- LNANES S SNVDHANSNEQHNAVMDWYLRQT AHNQQDD E 119 

Query: 200 SNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLRKK 237

+ +N N GN F++ ++ +D D KK 

Sbjct: 120 DDEN--NNNTDNGNDSNNHFSQSDIV- -VDDDDDKNKK 153 

>gi|2952545 (AF051898) coronin binding protein [Dictyostelium
discoideum]
Length = 560

Score = 44.9 bits (104), Expect = 6e-04

Identities = 26/83 (31%), Positives = 39/83 (46%), Gaps = 5/83 (6%) 

Query 131 NEDTTSNTDETSNQNATSLDNSTGOTANRNAWSLPQSEW 190 

N + +N +N N+ S +NS +N N+ * P N D DN T +NNT +N 
Sbjct: 404 NNNNNNNIINNNNSNSNSNNNSNN-NSNNNSNRNSPNHNNNGDNDNNT NNOTNNNN 458 

Query: 191 KTVNKSSNESNQNAKRNQNQKGN 213 

N ++N +N N N N N 
Sbjct: 459 NNNNNNNNNNNNNNWINNlfNNNN 481 



Score = 41.4 bits (95), Expect = 0.006

Identities = 22/88 (25%), Positives = 43/88 (48%), Gaps = 6/88 (6%)

Query: 130 HNEDTTSNTDETSNQNATSLDN STGMTANRNAYVSLPQSEVNIDVDNTTLRFADNNT 186 

+ ++ +N ++ SN N+ + +N + G AN++ + P ♦ +N + DN +NN 
Sbjct: 337 NRNNSNNNSNNNSNNNSNNSNNRNITNGSNANKS NSPNNNliNTNNDNKNNNSNNNNN 393 

Query: 187 IDNGKTVNKS SNESNQNAKRNQNQKGNA 214 

+N S+N +N N N N N+ 

Sbjct: 394 SNNNSNNGNSNNNNNNNI INNNNSNSNS 421 

Score = 40.6 bits (93), Expect = 0.011

Identities = 24/101 (23%), Positives = 41/101 (39%), Gaps = 2/101 (1%)

Query 115 SSEVTIKYI^SQGFTEHNEDTTSNTDETSNQNATSI^NSTGMTANRNAYVSLPQSEVNIDV 174 
^ " 1 - L + ++ N +N ++ N S +N+ N N S + N + 



S + 
Sbjct: 370 SNSPNNNLNTNNDNKNNNSNNNNNSNNN 429

Query: 175 DNTTLRFADN- -NTIDNGKTVNKSSNESNQNAKRNQNQKGN 213 

+N + R + N N DN N ++N +N N N N N 
Sbjct: 430 KNNSNRNS P NHNNNG D ND ^^TNNNTNNNNNN^rNNNNN^^^NN 470 

Score = 40.2 bits (92), Expect = 0.014

Identities = 21/80 (26%), Positives = 39/80 (48%), Gaps = 9/80 (11%)

Query 130 HNEDTTSbrTDETSNQNATSLDMSTGMTANRNAWSLPQSEVlIIDvI)>rrrLRFADNNTIDN 189 
" * +N D +NT4- +N N + +N+ N N M • + +N +ADN+ ++ 
Sbjct: 442 ^GDbTONNTNNHTNNNNNNNNNNNN NNNNNNNNNNYADNSNNNS 492 

Query: 190 GKTVNKSSNESNQNAKRNQN 209 

♦ N +SN +N N +N+N 
Sbjct: 493 SNSNNNNSNSNNNNDNKNEN 512

Score = 39.5 bits (90), Expect = 0.024
Identities = 26/111 (23%), Positives = 44/111 (39%), Gaps = 20/111 (18%)

Query: 112 VYSSSEVEKYLQSQ- -GFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSE 169 

VY + K+ ++ G +N ++ +N++ SN N + +N N N 
Sbjct: 296 WCTHHHTKFYETHRNGLLNNNNNSNNNSNSNSNN^ 346 

Query: 170 VNIDVDNTTLRFADNNTIDNGKTVNKSS NESNQNAKRNQNQKGNA 214

+ N ++N I NG NKS+ N +N N N N N+ 

Sbjct: 347 NNSNNNSNNSNNRNITNGSNANKSNS PNNNLNTNNDNKNNNSNNNNNS 394 

Score = 37.5 bits (85), Expect = 0.094

Identities = 24/96 (25%), Positives = 41/96 (42%), Gaps = 1/96 (1%)

Query: 124 SQGFTEHNEDTTSNTDETSNQNATSLDNSTGM-TANRNAYVSLPQSEVNIDVDNTTLRFA 182

S + +N + SN + + + N DN+T TNN + + N + +N 
Sbjct: 421 SNNNSt^SNNNSNRNSPbmNNNGDNDNNTNNNTNNNNNN^ 480 

Query: 183 DNNTIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQ 218

+NN DN + +SN +N N+ N ♦ K Q 
Sbjct: 481 NNNYADNSNNNSSNSNNNNSNSNNNNDNKNENSDNQ 516 

Score = 35.6 bits (80), Expect = 0.36 

Identities = 25/99 (25%), Positives = 42/99 (42%), Gaps = 18/99 (18%)

Query: 130 HNEDTTSNTDETSNQNATSLDNST-GMTANRNAYVSLPQSEVNIDVDNTTLRFADNNTID 188

+N + SN + +N N +♦ N T G AN++ ♦ P ♦ +N + DN , +NN ♦ 

Sbjct: 339 NNSNNNSNNNSNNNSNNSNNRNITNGSNANKS NS PNNNLNTNNDNKNNNSNNNNNSN 395 

Query: 189 NGKTV NKSSNESNQNAKRNQNQKGN 213

N N S++ SN H M N 

Sbjct: 396 NNSNNGNSNNNNNNNI INNNNSNSNSNNNSNNNSNNNSN 434 

Score = 35.2 bits (79), Expect = 0.47

Identities = 21/94 (22%), Positives = 42/94 (44%), Gaps = 5/94 (5%)

Query: 124 SQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFAD 183

+ G + +N T+N N + N+ N N+ + N + +N + + 

Sbjct: 362 TNGSNANKSNS PNNNLNTNNDNKNNNSNN -NNNSNNNSNNGNSNNNNNNNIINNNN 416 

Query: 184 NNTIDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217

+N+ N + N S+N SN+N+ + N N T 
Sbjct: 417 SNSNSNNNSNNNSNNNSNRNS PNHNNNGDNDNNT 450 

Score = 35.2 bits (79), Expect = 0.47

Identities = 29/118 (24%), Positives = 53/118 (44%), Gaps = 12/118 (10%) 

Query: 115 SSEVEKYLQS-QGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID 173

SS+ E +GF ♦ + T+N ++N D S+G + + + V+ P+S +N 

Sbjct: 114 SSDSEADIEDDKGFQD- -KPITTNNSGSNNPLKNLKDYSSGSSGSSRSGVNQPRSNINNS 171 

Query: 174 VDNTTLRFADNNT IDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQ 222

^ y * D + ♦ + N+ I + T + NQN +NQNQ N Q *Q 
Sbjct: 172 NDKYKSKSSSSNSNSSSSGGSLISSLLTGGNTYQNQNQNQNQNQNQNNNQSQLQQQQQ 229

Score = 34.4 bits (77), Expect = 0.81

Identities = 24/94 (25%), Positives = 38/94 (39%), Gaps = 12/94 (12%) 

Query: 131 NEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFADNNTIDNG 190

N +T +N + +N N + +N+ N N S N N +NN+ N 

Sbjct: 451 NNNTNNNNNNNNNNNNNNNNNNl^^ NNNSNSNN 504 

Query: 191 KTVNKSSNESNQNAKR NQNQKGNAKGTQ 218 

NK+ N NQ+ R ++NQK + Q 

Sbjct: 505 NNDNKNENSDNQSVLRSNEKFTDENQKNGSDDQQ 538 

Score = 33.6 bits (75), Expect = 1.4 

Identities = 22/90 (24%), Positives = 35/90 (38%) 

Query: 124 SQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFAD 183

S N SN + + +++ N N+ NN + +N + +N 

Sbjct: 353 SNNSNNRNITNGSNANKSNSPNNNLNTNNDNKKNNSNNT^^ 412 

Query: 184 NNTIDNGKTVNKSSNESNQNAKRNQNQKGN 213

NN N + N S+N SN N+ RN N 
Sbjct: 413 NNNNSNSNSNNNSNNNSNNNSNRNSPNHNN 442 

>gi|535260|emb|CAA82996| (230339) STARP antigen [Plasmodium
reichenowi]
Length = 655

Score = 44.5 bits (103), Expect = 7e-04

Identities = 31/114 (27%), Positives = 47/114 (41%), Gaps = 14/114 (12%)

Query: 128 TEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVN ID VDNTT LRF 181 

T++N T TD + + +N+T A N + + + N D +NT + 

Sbjct: 433 TDNNNTNTKATDSNNTNTKATDNNl^ 492 

Query: 182 AD NNT I DNG KTVN KS S NE S NQNAKRNQNQ KGNAKGT Q FTKQYL I DN 227

DNN DN T K+++ +N N K N N K T T QY+ N 

Sbjct: 4 93 TDNTtfNTNTKATDNNNTNTKATDN^ 546 

Score = 44.5 bits (103), Expect = 7e-04

Identities = 30/103 (29%), Positives = 44/103 (42%), Gaps = 13/103 (12%)

Query: 128 TEHNEDTTSNTDETSNQNATSLDNS TGMTANRNAYVSLPQSEVN ID VDNTT L 179

T++N T TD+++N + + DN+ T T N N S D +NT 

Sbjct: 401 TDNNNTDTKATDKSNOTDTKATDNNNNTO^ 460 

Query: 180 RFADNNTI -DNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217

♦ DNN DN T K+++ +N N K N N K T 

Sbjct: 461 KATDNNNTNT KATDNNNTNT KAT DNNNTNT KATDN^^ 503 

Score = 42.6 bits (98), Expect = 0.003

Identities = 27/96 (28%), Positives = 43/96 (44%), Gaps = 10/96 (10%)

Query: 128 TEHNEDTTSNTDETSNQNATSLD-NSTGMTANRNAYVSLPQSEVNIDVDNTTLRFADNNT 186

T+ +N +T + + +N N + D N+T A N ♦ +♦ N NT + DNN 
Sbjct: 422 TDNNNNTDTKATDNNNTNTKATDSNNTNTKATDNNNTNTKAT^ NTNTKATDNNN 477 

Query: 187 I DNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

DN T K+++ +N N K N N K T 
Sbjct: 478 TNT KATDNNNTNTKATDNNNTNTKATDNNNTNTKAT 513 

Score = 41.8 bits (96), Expect = 0.005

Identities = 35/150 (23%), Positives = 59/150 (39%), Gaps = 9/150 (6%)

Query: 85 EINRQTVEAFGMQVITVCITHEDYLNVVYSSSEVEKYLQSQGFTEHNEDTTSNTDETSNQ 144

E N+ ++ G T+ + N + E + +Q T +N TT+ + N 

Sbjct: 118 ETNKTNIKLTGNNSTTINTNLTENTNA- -TKKLTENVITNQILTGNNNTTTNTSSTEHNN 175 

Query: 145 NATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFADNNTIDNGKTVNKSSNESNQNA 204
N + NSTG T+ NI + N L +N T + T + ♦+ +N N+ 
Sbjct: 176 NINTNTNSTGNTSTTKKLTE NI-ITNQILTGNNNTTTNTSSTEHNNNINTNTNS 228

Query: 205 KRNQNQKGNAKGTQFTKQYLIDNIDKAYDL 234 

N N N T + DNI + - +L 

Sbjct: 229 TDNSNTNTNLTDITTTTKKWTDNINTTQNL 258 

Score = 41.8 bits (96), Expect = 0.005
Identities = 30/101 (29%), Positives = 43/101 (41%), Gaps = 13/101 (12%)

Query: 130 HNEDTTSNTDETSNQNATSLDNS-TGMTANRNAYVSLPQSEVNIDV DNTTLRFA 182

+N DT S ++ ++ AT DN+ T T N N ♦ M ♦ 

Sbjct: 363 NNTDTISTDNDtTTDTKATDNDNTDTKATDNNNNT^^ 422 

Query: 183 DNN TIDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217

DNN DN T K+++ +N N K N N K T 

Sbjct: 423 DNNNNTDT KATD NNNTNT KATD S NNTNT KATD NNNTNT KAT 463 

Score = 40.6 bits (93), Expect = 0.011 

Identities = 31/121 (25%), Positives = 47/121 (38%), Gaps = 31/121 (25%)

Query: 128 TEHNEDTTSNTDETSNQNAT- - — -SLDNSTGMTAKRNAYVSLPQSEVN 171 

TEHN + +NT+ T N + T ++++TNN + +EN 

Sbjct: 171 TEHNNNINTNTNSTGOTSTTKKLTENIITO^^ 230 

Query: 172 IDVDNTTLRFADN NT I DNG KTVN KS S NE S NQN AKRNQNQ KGNAKG 216

D+ TT ++ DN T N TV> +N +N N K N N K

Sbjct: 231 N SNTNTN LTD I TTTT KKWTDN 1 NTTQNLTT S TNTTTVS TDNNNNN I NT KPTDNNNTN I KS 290 

Query: 217 T 217
T 

Sbjct: 291 T 291 
Score = 38.3 bits (87), Expect = 0.055

Identities = 28/98 (28%), Positives = 41/98 (41%), Gaps = 10/98 (10%) 

Query- 128 TEHNEDTTSNTDETSNQNATSUDNSTGMTANRNAWSLPQSEW 186 

TEHN + +NT+ s N+ + N T +T + + N+ NTT DNN 

Sbjct- 216 TEHNNNINTNTN--STDNSNTNTNLTDITTTTKKWTDNINTTQNLTTS^^ 273 



Query- 187 - -IDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

DN T KS++ N K N+ + K T 
Sbjct: 274 NN I NT KPTDNNNTN I KSTDNYNTGTKETDNKNTD I KAT 311 

Score = 37.5 bits (85), Expect = 0.094

Identities = 31/106 (29%), Positives = 45/106 (42%), Gaps = 18/106 (16%)

Query: 128 TEHNEDTTSNTDETSNQN ATSLDNSTGMTANRNAYVS LPQS EVN IDVDN 176

T++N +T +T T N N AT NVT A N + ++ N D +N 

Sbjct: 390 TDNNNNT - -DTKATDNNNTDTKATDKSNNTDTKATDNNNNTDTKATDNNNTNTKATDSNN 447 

Query: 177 TTLRFADNN T I DNG KTVNKS SNE SNQNAKRNQNQKGNAKGT 217

T + DNN DN T K+++ +N N K N N K T 

Sbjct: 448 TNT KAT D NNNTNT KATD NKNTNT KATD NNNTNT KATD 493



Score = 35.2 bits (79), Expect = 0.47 

Identities = 24/109 (22%), Positives = 46/109 (42%), Gaps = 6/109 (5%)

Query: 128 TEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVN IDVDNTTLRF 181

T++N T to + + + N + T AN + N D +NT +

Sbjct: 473 TD NNNTNT KATDNNNTNT KATD NNNTNT KATD N^^ 532 

Query 182 ADNNTIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDK 230 

DNN N + +E+ + K N++ N++ ♦ K + +DK 

Sbjct: 533 TDNNNNTNQ YVF ANNYD ETTS DD KLNKD S CDNS E E KEN I KS M I N AYLD K 581 

Score = 34.4 bits (77), Expect = 0.81

Identities = 26/126 (20%), Positives = 46/126 (35%), Gaps = 7/126 (5%)

Query: 99 ITVCITHEDYLNVVYSSSEVEKYLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTAN 158

IT T+ + ++ S ♦ V ST +++ +N T N N + + T 

Sbjct: 318 I TTDNTNTNV I STDNS KTNV I S KDNSNTHT I STDNS KTNV I STDNNNTDT I STDNDNTDT 377 

Query: 159 RNAYVS L PQS EVN I DVD NTT LRFADNNT I D NGKTVNKSSNESNQNAKRNQNQK 211 

+ ++ + +NT ♦ DNN D N +N+N + KN 

Sbjct: 378 KATDNDNTDTKATDNNNNTDT KATDNNNTDT KATD KSNNTDTKATDNNNNTDT KATD NNN 437 

Query: 212 GNAKGT 217 
N K T 

Sbjct: 438 TNTKAT 443 



Score = 34.4 bits (77), Expect = 0.81

Identities = 30/100 (30%), Positives = 44/100 (44%), Gaps = 14/100 (14%) 

Query: 131 NEDTTSNTDETSNQNATSLDNS-TGMTANRNAY VSLPQSEVNI DVDNTTLRFAD 183 

N+TTDTNNS DNS T + + N+ +S S+ N+ D +NT D 
Sbjct: 313 NNNITITTDNT-NTNVISTDNSKTNVISKDNSOT^ 371 

Query: 184 NNT I DNG KTVNKS S NESNQNAKRNQNQKGNAKGT 217 

N+DTN++ N+N+KN+KT 
Sbjct: 372 NDOTDTKATDNDNTDTKATDNNNNTDTKATDNNNTDTKAT 411 



Score = 34.4 bits (77), Expect = 0.81

Identities = 28/101 (27%), Positives = 41/101 (39%), Gaps = 15/101 (14%)

Query: 131 NEDTTSNTDETSNQNATSLDNSTGMTA- -NRNAYVSLPQSEVNIDV DNTTLRFA 182 

N DT + ++ ++ AT +N+T ANN N D +NT ♦ 

Sbjct: 374 NTDT KATDNDNTDT KATDNNNNTDT KATD NNNTDT KATD KSNNTDT KATDNNNNTDTKAT 433 

Query: 183 DNNTIDNGK TVNKS SNESNQNAKRNQNQKGNAKGT 217 

DNN N K T K+ + + +N N K N N K T 

Sbjct: 434 DNNN-TNTKATDSNNTNTKATDNNNTNTKATDNNNTNTKAT 473 



Score = 32.5 bits (72), Expect = 3.1

Identities = 30/110 (27%), Positives = 40/110 (36%), Gaps = 23/110 (20%) 

Query: 131 NEDTTSNTD ETSNQN ATS LDN S TGMTANRNAYVSLPQS EVN I DVDNTTLRF 181 

N +TT N ++N S DN+ TTNN+ + DNT++ 

Sbjct: 251 NINTTQNLTTSTNTTTVSTDNNNNNINTKPTDNNNTNIKSTDNYNTGTK^ 310 

Query: 182 ADNNTI DNGKTVNKS SNESNQNAKRNQNQ KGNAKGT 217 

DNN I DNKT S + SN + N K N T 

Sbjct: 311 TDNNNITITTDtmnWISTDNSKTNVISKDNSNTHTISTDNSKTNVIST 360 

>gi|1429240|emb|CAA67659| (X99260) lower collar protein
[Bacteriophage B103]
Length = 293

Score = 43.8 bits (101), Expect = 0.001

Identities = 53/204 (25%), Positives = 79/204 (37%), Gaps = 42/204 (20%) 

Query: 56 EKVFKG FSLKDELSDLLFKKSFTIHFLD RE I NRQTVEAFGMQ V I TVC I THED 107 

EK+ KG F + + D ++K F HF+ REI +T F + T I + 

Sbjct: 26 EKIEKGRPKLFDFQYPIFDESYRKVFETHFIRNFYMREIGFETEGLFKFNLETWLIINMP 85 

Query: 108 YLNWYSSSEVEKY LQSQGFTEH NEDTT SNTDETSNQNA 146 

Y N ++ S E+ KY L + G +♦ N DTT SNT + NA 

Sbjct: 86 YFNKLFES-ELIKYDPLENTRLNTTGNKN^TERNDNRDTTGSMKADGKSNTKTSDKTNA 144 

Query: 147 TSLDNSTGMTA NRNAYVSLPQSEVNIDVDN- -TTLRFADNNTIDNGKTVNKS 196 

T G T NR PS +N+ ++ TL +A + 1+ T NK 

Sbjct: 145 TGS S KEDGKTTGS VTDDN FNRKI DS DQ PD S RliNLTTNDGQGTLE YA - - S A I E ENNTNNKR 202

Query: 197 S N E S NQNAKRNQNQ KGN AKGTQ FT 220 

+ H + ♦ . GT T 

Sbjct: 203 NTTGTNNVTSSAESESTGSGTSDT 226 

Query= pt|110879 44AHJDORF009 Phage 44AHJD ORF | 5744-6496 | 2 1
(250 letters)

>gi|2764981|emb|CAA69021.1| (Y07739) N-acetylmuramoyl-L-alanine
amidase [Staphylococcus phage Twort]
Length = 467

Score = 180 bits (452), Expect = 1e-44

Identities = 89/157 (56%), Positives = 109/157 (68%), Gaps = 8/157 (5%)

Query: 1 MKSQQQAKEWIYKHEGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDAINNDFK 60

MK+ +QA+ +1 G DFDG YG+QCMDL+V Y+Y++TDGK+RMWGNAKDAINN F 

Sbjct: 1 MKTLKQA£SYIKSKVTn , GTDFDGLYGYQCMDL»AVBYIYHVTDGKIRMWGNAKDAINNSFG 60 

Query: 61 GLATVYKNTPSFKPQLGDVAVYTNGQ YGHIQCVLS GNLDYYTCLEQNWLGGGF 113 

G ATVYKN P+F+P+ GDV V+T G YGHI V + G+L Y T LEQNW G G 
Sbjct: 61 GT ATVYKNY P AF R P KYGD WVWTTGN F AT YGH I A I VTN PD P YGD LQ YVTVL E QNWNGNG I 120 

Query: 114 DGWEKATIRTHYYDGVTHFIRPKFSGSNS-KALETSK 149

E ATIRTH Y G+THFIRP F+ +S K +T K 
Sbjct: 121 YKTELATIRTHDYTGITHFIRPNFATESSVKKKDTKK 157 

Score = 61.7 bits (147), Expect = 6e-09 

Identities = 41/125 (32%), Positives = 57/125 (44%), Gaps = 8/125 (6%) 

Query: 125 YYDGVTHFIRPKFSGSNSKALETSKVNTFGKWKRNQYGTYYRNENGTFTC-GFLPIFARV 183

YY+G T P +K + +T G W . N YGTYY++E+ TF C I R 

Sbjct: 346 YYEGKTPV- -PTVVNQKAKTKPVXQSSTSG-WNVWYGTYYKSESATFKCTARQGIVTRY 402 

Query: 184 GSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNWQGTR-YYLPVRQWNGKTGNSYSV 242

p + p Y+ VC DGYVWI + G + ++PVR W+ N+ ♦ 
Sbjct: 403 TGPFTTCPQAGVXYYGQSVTYDTVCKQDGYWISWTTO KNTDIM 459 

Query: 243 GIPWG 247 

G WG 
Sbjct: 460 GQLWG 464 

>gi|113675|sp|P24556|ALYS_STAAU AUTOLYSIN
(N-ACETYLMURAMOYL-L-ALANINE AMIDASE)
>gi|79887|pir||JQ1147 N-acetylmuramoyl-L-alanine amidase
(EC 3.5.1.28) - Staphylococcus aureus >gi|153067
(M76714) peptidoglycan hydrolase [Staphylococcus aureus]
Length = 481

Score = 118 bits (292), Expect = 6e-26

Identities = 56/117 (47%), Positives = 68/117 (57%), Gaps = 1/117 (0%)

Query: 135 PKFSGSNSKALETSKVNTFGK-WKRNQYGTYYRNENGTFTCGFLPIFARVGSPKLSEPNG 193 

P + SN + ++ V WKRN+YGTYY Ef ,FT G PI R P LS P G 

Sbjct: 365 PVATVSNESSASSNTVKPVASAWKRNKYGTYYMEESARFTNGNQPITVRKVGPFLSCPVG 424

Query: 194 YWFQPNGYTPYNEVCLSDGYVWIGYNWQGTRYYLPVRQWNGKTGNSYSVGIPWGVFS 250

Y FQP GY Y EV L DG+VW+GY W+G RYYLP+R WNG + +G WG S 
Sbjct: 425 YQFQPGGYCDYTEVMLQDGHVWVGYTWEGQRYYLPIRTWNGSAPPNQILGDLWGEIS 481

Score = 78.0 bits (189), Expect = 7e-14 

Identities = 48/109 (44%), Positives = 62/109 (56%), Gaps = 6/109 (5%)

Query: 15 EGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDA-INNDFKGLATVYKNTPSFK 73

EG ♦ D YGFQC D ♦ A + +. G ♦ AKD N+F GLATVY+NTP F 

Sbjct: 18 EGKQFNVDLWYGFQCFDYANAG-WKVTiFGLLLKGLGAKDIPFANNFDGLATVYQNTPDFL 76 

Query: 74 PQLGDVAVYTNGQ YGHIQCVLSGNLDYYTCLEQNWLGGGF-DGWEK 118 

Q GD+ V+ + -YGH+ V+ LDY EQNWLGGG+ DG E+ 
Sbjct: 77 AQPGDMWFGSNYGAGYGHVAWVI EAT LDY I IVYEQNWLGGGWTDG I EQ 125 



>gi|1763243 (U72397) amidase [bacteriophage 80 alpha]
Length = 481

Score = 118 bits (292), Expect = 6e-26

Identities = 56/117 (47%), Positives = 68/117 (57%), Gaps = 1/117 (0%)

Query: 135 PKFSGSNSKALETSKVNTFGK-WKRNQYGTYYRNENGTFTCGFLPIFARVGSPKLSEPNG 193

p 4- SN + ♦+ V WKRN+YGTYY E* FT G PI R P LS P G 

Sbjct: 365 PVAWSNESSASSNTVXPVASAWKRNKYGTYYMEESARFTO 424 
Query: 194 YWFQPNGYTPYNEVCLSDGYVWIGYNWQGTRYYLPVRQWNGKTGNSYSVGIPWGVFS 250

Y FQP GY Y EV L DG+VW+GY W+G RYYLP+R WNG > +G WG S 
Sbjct: 425 YQFQPGGYCDYTEVMLQDGHVWGYTWEGQRYYLPIRTWGSAPPNQILGDLWGEIS 481 

Score = 83.5 bits (203), Expect = 2e-15 

Identities = 50/115 (43%), Positives = 65/115 (56%), Gaps = 6/115 (5%)

Query: 9 EWIYKHEGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDA-INNDFKGLATVYK 67

EW+ EG + D YGFQC D + A + + G + AXD N+F GLATVY+ 

Sbjct: 12 EWLKTSEGKQFNVDLWYGFQCFDYANAG -WKVLFGLLLKGLGAKDI PFANNFDGLATVYQ 70 

Query: 68 NTPSFKPQLGDVAVYTNGQ YGHIQCVLSGNLDYYTCLEQNWLGGGF-DGWEK 118 

NTP F Q GD+ V+ + YGH+ V+ LDY EQNWLGGG+ DG E+ 

Sbjct: 71 bTTPDFLAQPGDMWFGSNYGAGYGHVAWVIEATLDYIIVlfEQNWIXKSGWTDGIEQ 125 

>gi|4574237|gb|AAD23962.1|AF106851_1 (AF106851) LytN [Staphylococcus
aureus]
Length = 383

Score = 84.3 bits (205), Expect = 9e-16

Identities = 48/128 (37%), Positives = 68/128 (52%), Gaps = 7/128 (5%)

Query: 15 EGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDAINNDFKGLATVYKNTPSFKP 74

J ' E G DFDG+YG+QC DL Y ♦+ ++ +G N+F A +Y NTP+FK 

Sbjct: 252 EtniGWDFDGSYGWQCFDLVTTVYVftfflLYGHGLKG 311 

Query: 75 QLGDVAVYT- - -NGQYGHIQCVLSGNLD YYTCLEQNWLGGGFDGWEKATIRTHYYD 127

+ GD+ V++ G YGH VL+G+ D + L+QNW GG> E A H Y+ 

Sbjct: 312 E PGDL WFSGRFGGGYGHTAI VXNGD YDGKIJ4KFQS LDQNWNNGGWRXA^^ 371 

Query: 128 GVTHFIRP 135 
FIRP 

Sbjct: 372 NDMIFIRP 379 

>gi|3767593|dbj|BAA33856.1| (AB015195) LytN [Staphylococcus aureus]
Length = 383

Score = 84.3 bits (205), Expect = 9e-16

Identities = 48/128 (37%), Positives = 68/128 (52%), Gaps = 7/128 (5%)

Query: 15 EGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDAINNDFKGLATVYKNTPSFKP 74

E G DFDG+YG+QC DL Y ++ ++ +G N+F A +Y NTP+FK 

Sbjct: 252 ENRGVTOFDGSYGWQCFT3LVNVYWNHLYGHGLKGYGAXDIPYANNFNSEAKIYHNTPTFKA 311 

Query: 75 QLGDVAVYT NGQYGHIQCVLSGNLD YYTCLEQNWLGGGFDGWEKATIRTHYYD 127 

+ GD+ V+ + G YGH VL+G+ D + L+QNW GG+ E A H Y+ 

Sbjct: 312 EPGDLWFSGRFGGGYGHTAIVLNGDYIXSKXiMKFQSLDQNW 371 

Query: 128 GVTHFIRP 135

FIRP 

Sbjct: 372 NDMIFIRP 379 

>gi|2764983|emb|CAA69022.1| (Y07740) cell wall hydrolase Ply187
[Staphylococcus phage 187]
Length = 628

Score = 76.9 bits (186), Expect = 2e-13

Identities = 50/144 (34%), Positives = 68/144 (46%), Gaps = 18/144 (12%)

Query: 5 QQ AKEW I YKH EGAG VD FDG A YG F QCMD LS VA YVYY I TDG KVRMW GNAKD A I NND F 59

+ q + w G+GVD DG YG QC DL Y++ R W GNA+D + 

Sbjct: 12 KQWDWAINLIGSGVDVDGYYGRQCWDLP-NYIFN R YWN F KT PGNARDMAWYR Y 64 

Query: 60 KGLATVYKNTPSFKPQLGDVAVYTNGQY GHIQCVLS-GNLDYYTCLEQNWLGGGF 113 

V++NT F P+ GD+AV+T G Y GH V+ Y+ + +QNW 

Sbjct: 65 pEGFKVFROTSDFVPKPGDIAWTGGNYNWNTWGHTGIWGPSTKSYFYSV^ 124 

Query: 114 DGWEKATIRTHYYDGVTHFIRPKF 137 

A H Y GVTHF+RP ♦ 
Sbjct: 125 YVGS PAAKIKHSYFGVTHFVRPAY 148 
>gi|3287732|sp|O05156|ALE1_STACP GLYCYL-GLYCINE ENDOPEPTIDASE ALE-1
PRECURSOR >gi|1890068|dbj|BAA13069| (D86328) ALE-1
[Staphylococcus capitis]
Length = 362

Score = 73.4 bits (177), Expect = 2e-12

Identities = 47/117 (40%), Positives = 61/117 (51%), Gaps = 10/117 (8%)

Query: 132 FIRPKFSGSNSKALETSKVNTFGKWKRNQYGTYYRNENGTFTCGFLPIFARVGSPKLSEP 191

F++ GSNS TS N G +K N+YGT Y++E+ +FT I R+ P S P 

Sbjct: 252 FLKSAGYGSNS TSSSNNNG-YKTNKYGTLYKSESASFTAN-TDIITRLTGPFRSMP 305 

Query: 192 NGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNGKTGNSYSVGIPWG 247

+ Y+EV DG+VW+GYN G R YLPVR WN TG +G WG 
Sbjct: 306 QSGVXRKGLTIKYDEWKQDGHVWGYNTNSGKRv^PVllTWNESTG ELGPLWG 359

>gi|79926|pir||A25881 lysostaphin precursor - Staphylococcus
simulans >gi|153047 (M15686) lysostaphin (ttg start
codon) [Staphylococcus simulans]
Length = 389

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSKALETS KVNTFGK WKRNQYGTYYRNENGTFTCG 175

HF R S SNS A + K +GK WK N+YGT Y++E+ +FT 

Sbjct: 258 H FQRMVN S F SN S TAQD P M P F LKS AG YG KAGGTVT PT P NTGW KTNKYGT L YKS E S AS FT PN 317 

Query: 176 FLPIFARVGSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNG 234

I R P S P + Y+EV DG+VW+GY G R YLPVR WN 

Sbjct: 318 -TDIITRTTGPFRSMPQSGVXKAGQTIHYDEVMKQDGHVWVGYTC^ 376 

Query: 235 KTGNSYSVGIPWG 247 

T ++G+ WG 
Sbjct: 377 STN TLGVLWG 386 

>gi|126496|sp|P10548|LSTP_STAST LYSOSTAPHIN PRECURSOR
(GLYCYL-GLYCINE ENDOPEPTIDASE) >gi|79927|pir||S01079
lysostaphin precursor - Staphylococcus simulans bv.
staphylolyticus >gi|581744|emb|CAA29494| (X06121)
lysostaphin (AA 1-480) [Staphylococcus simulans bv.
staphylolyticus]
Length = 480

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSKALETS- --KVNTFGK -WKRNQYGTYYRNENGTFTCG 175

HF R S SNS A + " K +GK WK N+YGT Y++E+ +FT 

Sbjct: 349 HFQRMVNSFSNSTAQDPMPFLKSAGYGKAGGTVTPTPNTGWKTNKYGTLYKSESASFTPN 408 

Query: 176 FLPIFARVGSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNG 234

IRP SP + Y+EV DG+VW+GY G R YLPVR WN 

Sbjct: 409 -TDIITRTTGPFRSMPQSGVLKAGQTIHYDEVMKQDGHVWVGYTGNSGQRIYLPVRTWNK 467

Query: 235 KTGNSYSVGIPWG 247 

T ++G+ WG 
Sbjct: 468 STN- - -TLGVLWG 477 

>gi|3287967|sp|P10547|LSTP_STASI LYSOSTAPHIN PRECURSOR
(GLYCYL-GLYCINE ENDOPEPTIDASE) >gi|2072411 (U66883)
lysostaphin [Staphylococcus simulans]
Length = 493

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSKALETS- --KVNTFGK WKRNQYGTYYRNENGTFTCG 175

HF R S SNS A + K " +GK WK N+YGT Y++E+ +FT 

Sbjct: 362 HFQRMVNSFSNSTAQDPMPFLKSAGYGKAGGTVTPTPNTGWKTNKYGTLYKSESASFTPN 421 

Query: 176 FLPIFARVGSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNG 234

I R P S P + Y + EV DG+VW+GY G R YLPVR WN

Sbjct: 422 -TDIITRTTGPFRSMPQSGVLKAGQTIHYDEVMKQDGHVWVGYTGNSGQRIYLPVRTWNK 480

Query: 235 KTGNSYSVGIPWG 247 

T + +G+ WG 
Sbjct: 481 STN TLGVLWG 490 

>gi|3341932|dbj|BAA31898.1| (AB009866) amidase (peptidoglycan
hydrolase) [bacteriophage phi PVL]
Length = 484

Score = 68.3 bits (164), Expect = 6e-11
Identities = 52/150 (34%), Positives = 71/150 (46%), Gaps = 17/150 (11%)

Query: 3 SQQQAKEWIYKHEGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDAINNDFKGL 62

+ + QA++W G + D YGFQC D + + + I G+ R+ G 

Sbjct: 4 TKNQAEKWFDNSLGKQFNPDLFYGFQCYDYASMF-FMIATGE-RLQGLYAYNIPFDNKAR 61 

Query: 63 ATVY -KNTPSFKPQLGDVAVYTN- - - GQ YGH I QCVLSGNLD YYTCLEQNWLGGG F - - 113

Y KN SF PQ D + V+ + G GH+-+ V S NL+ +T QNW G G+ 
Sbjct: 62 IEKYGQIIKNYDSFLPQKUUVVFPSKY^^ 121 

Query: 114 DGW- -EKATIRTHYYDGVTHFIRPKF 137

GW E T HYYD +FIR F 
Sbjct: 122 GVAQPGWGPETVTRHVHYYDDPMYFIRLNF 151 

Query= pt|110882 44AHJDORF012 Phage 44AHJD ORF | 8391-8813 | 3 1
(140 letters)

>gi|140528|sp|P24811|YQXH_BACSU HYPOTHETICAL 15.7 KD PROTEIN IN
SPOIIIC-CWLA INTERGENIC REGION (ORF2)
>gi|322189|pir||B44816 orf2 5' of autolytic amidase -
Bacillus subtilis >gi|142801 (M59232) open reading frame
2 [Bacillus subtilis] >gi|1217874|dbj|BAA06959| (D32216)
ORF121 [Bacillus subtilis] >gi|1303767|dbj|BAA12423|
(D84432) YqdD [Bacillus subtilis]

>gi|2635036|emb|CAB14532| (Z99117) alternate gene name:
yqdD; similar to holin [Bacillus subtilis]
Length = 140

Score = 80.4 bits (195), Expect = 6e-15

Identities = 45/130 (34%), Positives = 67/130 (50%), Gaps = 3/130 (2%)

Query: 4 VKFRFTDSEAFHMFIYAGDLKLLYFLFVLMFVDIITGISKAIKNNNLWSKKSMRGFSKKX 63
Query. ^ ^ ^ g +K L L VL +D++TG+ KA K L S+- + G+ +K 



. F D ++ p G +K L L VL +D++TG+ * u o<r -r ™ 
Sbjct: 8 INFETI^LARWLF---GGVKYU)LLLVLSIID^^ 64 

Query: 64 XXXXXXXXXX^^ 

^ y G L T+ +YIANEGLSI EN A++ V +P I D+L+ I 

LNFFAVIIANVIDTVTjNLNGVXTFGTV^ 



123 
124 



Sbjct: 65 

Query: 124 KNDTEKSDNN 133 

+N+ E+S NN 
Sbjct: 125 ENEKEQSKNN 134 

>gi|4126631|dbj|BAA36651.1| (AB016282) ORF45 [bacteriophage phi-105]
Length = 135

Score = 76.1 bits (184), Expect = 1e-13

Identities = 44/115 (38%), Positives = 61/115 (52%), Gaps = 4/115 (3%)

Query: 21 GDLKLLYFLFVLMFVDIITGISKAIKNNNLWSKKSMRGFSKKXXXXXXXXXXXXXXXXXX 80

" G*+K L + VL +DIITG+ KA K L S+ + G+ +K 

Sbjct : 17 GEVWLDLMLVl^IIDIITGVIKAWKFKELRSRSAWFGYVRKMLSFLVVIVANAIDTIMD 76 

Query: 81 XKGGLLMITIFYYIANEGLSIVENCAEMDVLVPEQIKDKLRVIKND-- - -TEKSD 131
Query- * t+ +yianeglsi w At+ v + P I D+L VI*+D TEK D 

Sbjct: 77 LNGVLTFATVLFYI AHEGLS ITENLAQIGVKI PAVITDRLHVI ESDNDQKTEKDD 131 

>gi|141088|sp|P26835|YNGD_CLOPE HYPOTHETICAL 14.9 KD PROTEIN IN NAGH
5'REGION (ORFD) >gi|1075967|pir||^"905 hypothetical
protein D - Clostridium perfringens >gi|455154 (M81878)
ORF D [Clostridium perfringens]
Length = 132

Score = 60.9 bits (145), Expect = 4e-09

Identities = 38/127 (29%), Positives = 63/127 (48%), Gaps = 3/127 (2%)

Query: 1 MNEVKFRFTDSEAFHMFIY-AGDLKLLYFLFVLMFVDIITGISKAIKNNNLWSKKSMRGF 59 

+N +K+ +1+ A D+ L+ L V +F+D +TG+ K K+ L S +RG 

Sbjct: 5 INYIKWGIVSLGTLFTWIFGAWDIPLITLL- VFIFLDYLTGVIKGCKSKELCSNIGLRGI 63 

Query: 60 SKKXXXXXXXXXXXXXXXXXXXKGGLLMITI -FYYIANEGLSIVENCAEMDVLVPEQIKD 118 

+KK +1 + +YI NEG+SI+ENCA + V +PE++K 

Sbjct: 64 TKKGLILVVLLVAVMI^RLLDNGTWMFRTLIAYFYIMNEGISILENOUU^VPIPEKLKQ 123 

Query: 119 KLRVIKN 125 

L+ 4- N 
Sbjct: 124 ALKQLNN 130 



>gi|2293160 (AF008220) YtkC [Bacillus subtilis]

>gi|2635548|emb|CAB15042| (Z99119) similar to autolytic
amidase [Bacillus subtilis]
Length = 134

Score = 36.4 bits (82), Expect = 0.099 

Identities = 25/109 (22%), Positives = 41/109 (36%) 

Query: 17 FIYAGDLKLLYFLFVLMFVDIITGISKAIKNNNLWSKKSMRGFSKKXXXXXXXXXXXXXX 76

F + G LLM++I+ K + L KK KK 

Sbjct: 20 FFFGGFQYSFLILLSLMAIEFISTTLKETIIHKLSFKKVFARLVTGCLVTLALISVCHFFD 79 

Query: 77 XXXXXKGGLLMITIFYYIANEGLSIVENCAEMDVLVPEQIKDKLRVIKN 125

♦G ♦ +1 +YI E + IV + + + VP+ ♦ D L +KN 
Sbjct: 80 QLLNTQGSIRDLAIMFYILYESVQIWTASSLGIPVPQMLVDLLETLKN 128 



>gi|1181973|emb|CAA87743.1| (Z47794) holin protein [Bacteriophage
CP-1]
Length = 134

Score = 31.3 bits (69), Expect = 3.3

Identities = 27/88 (30%), Positives = 36/88 (40%), Gaps = 5/88 (5%)

Query: 29 LFVLMFVDIITGISKAIKNNNLWSKKSMRGFSKKXXXXXXXXXXXX 86

LF L+ D ITG KA K S ++G K G +L 

Sbjct: 18 LFALILFDFITGFLKAWKWKVTDSWTGIJCGVIKHTLTF 77 

Query: 87 MITIFYYIANEGLSIVENCAEMDVLVPE 114

+ ♦ I Y A LSI+EN A M V +P+ 
Sbjct: 78 LVIINLYYA LSIMENLAVMGVFIPK 102
Table 21



Phage 182 complete genome sequence. 17833 nucleotides. 



1 tagaatattg tcataaaaca caaacataat aatgcatatt attgtttaca aatatgtaat ttcgtgatat 

71 aatatatttg taagttaaag gaggcgacaa aagaacaaat cataaatgct ttagaaattg caaaaactat 

141 tggaggaaaa ataatgaaat attcactaca acaaatagat gaaattaaat caacaatttt cagaattaga

211 ttaaaaaggc atgaactaga ggaattggtg gacgaagtaa acgatattgc taaagatccg gaggaaagat

281 atctcctatc gttttattac acagaagaag aacgtttgtt tgaaattccc tctgcaagat taatagatta

351 ttacaacgaa aagatcacaa atctgaaatc ggaaatcata tcactcgaaa aaagattaca aaaactagta 

421 aaataattac acaaaaagct ttacaaatat aacacatcat gttatactaa aagagtagta agggaacgga 

491 aaatacctta cttcacacct caatcattct tatcaaaata caaaaggagg gaaaataatg ggtcgaaaac 

561 taatgcaacg aaacgtaaca tcaactaaag tagaattctc agaagttatc gtacaagatg gagcgccaac 

631 aattgtacca tgcgaaccag ttgtcttaac aggaaaactt tcagaagaaa aagctttatc agcgatcaaa 

701 cgtaaaaacc ctgataaaaa cgtagttgta acaaatgttt cacatgaaac agcgctttac acaatgccag 

771 tcgataaatt tatcgagtta gcagacaaat caacacaagc ctaataaaaa caaaactaaa acaaaacaga 

841 ggagattata atcatggaaa tcgtaaaaag cacatttgac acacaaacac cagaaggaat gttacaagta 

911 ttcaatgcca caaacggggc ttcaattccg ttacgtaacg caattggcga agtactagaa ttgaaagata 

981 ttctagttta ctcagacgaa gtttctggtt ttggtggagc cgaaccatca caagcagaac tagtcgcttt 

1051 cttcacagaa gatggtaaaa cttatgcggg tgtatcagca gtagcaacaa aatcagctaa aaacctaatt

1121 gatatgatga ctgctaaccc tgacatcaaa ccaaaaattt cttttgtcga aggaaaatca aacggtggac 

1191 aaaaatttgt aaatctacaa gtggtttcac tgtagcataa aaatacagga atctagtaag ccacttagcg 

1261 aatctcgcta ggtggttttt attatgtttc tacattgagg tgtgtagaat tgaccgtaag aatatcaaag 

1331 aatgatagag ccaagttaga gaaaatctac ggtaaatcta acaaagctcg taaaaaatac aatcgtttaa 

1401 gacaaaaagg agttgaggaa aggcaacttc caactgttcc aacatcaaag aaaagactta ttgactacgt 

1471 aaaatcaaca aatatgagtc gtagtgattt taacaagatg ttagacgagt tggtagattt tgcacaacct 

1541 tacaacgaga attacatttt tgagatcaac aagcgaaatg ttgcaatctc aagagcgcaa atcaaagaag 

1611 cgcaaattaa aacagagcaa gctcaaaaag cgaaagaaga acactacaaa gagcttaaca aagttgaagt 

1681 taagaagccc acagaaaaca caattgtcac accaactatt ttaacagagt taggtgctga cttacctttt 

1751 caagcaatac cagattttaa tattgacgct ttcacttctc cagaaggagt tcagtcttat ttagaaaata 

1821 taggaaaaca agacgaacaa tattttgacg aaagagacca actttattac gacaatttca gacaagcgat 

1891 gtttactatt ttcaattcag acgctgacga tattgttcgt ttacttgact caatggggct tgatctattt 

1961 acgaaaacat atgttagtaa cttcttagac atgaaccttg actacattta tgacgaagca gaagtacaac 

2031 agaaaaaaga acaagtttac agtaagattg caaaagtgat cgagtctgaa acaggtggag aagtcccctc 

2101 atataacccc acgaagaaca tcacaattaa ttcagaaaca ggagaagaat tatgattaag aaatatactg 

2171 gcgactttga aacaacaact gatctcaacg attgtcgtgt atggtcgtgg ggcgtatgcg atatagacaa 

2241 cgttgacaat atgacgttcg gtttagaaat cgattctttt tttgagtggt gtaaaatgca aggcagcaca 

2311 gacatttatt tccacaacga aaaatttgac ggagagttta tgctttcatg gttattcaaa aatggtttca 

2381 aatggtgtaa agaagcaaaa gaagatcgaa cattctccac actcatatca aatatgggtc aatggtatgc 

2451 tttggaaatt tgttgggaag ttaattacac aacaacaaaa tcaggtaaaa cgaaaaaaga gaaatctcga 

2521 acaataattt atgatagcct taaaaaatat ccttttccag tgaaacaaat tgcagaagct tttaattttc 

2591 ctataaaaaa aggcgaaata gattatacaa aagaaagacc tattggttac aaaccaacaa aagatgaatg 

2661 ggagtattta aagaacgaca ttcagattat ggcgatggca ttaaaaattc aattcgatca aggactaact 

2731 cgaatgacta gaggaagcga cgctttaggc gattacaaag attggctaaa agctacacat ggaaaatcaa 

2801 ctttcaaaca atggtttcct attttgtctt tagggtttga taaagactta cgtaaagcat acaaaggcgg 

2871 cttcacttgg gtaaacaaag tttttcaagg gaaagaaata ggtgacggca ttgtctttga tgtcaactct 

2941 ctgtatccct ctcaaatgta cgtaagacct ttaccatatg gaacacctct attctacgaa ggagaataca 

3011 aaccgaacaa cgactatccg ctgtacattc aaaatatcaa agtaagactc cgtttaaagg agggttatat 

3081 tccaaccatt caagttaagc aaagttcatt attcattcaa aacgaatatc ttgaatcaag tgtaaacaag 

3151 ttaggagttg acgaattaat cgatcttact cttacaaatg ttgacctaga attatttttt gaacactacg 

3221 atattttaga gatacattac acttacggat atatgttcaa agcttcttgt gatatgttca aaggctggat 

3291 cgataaatgg atcgaagtaa agaacaccac cgaaggggct agaaaagcta acgccaaagg tatgttaaat 

3361 agcttgtatg gaaagttcgg aacaaaccct gacattacag gaaaagtgcc ttacatgggc gaggacggca 

3431 tcgttcgatt gacactagga gaagaagaat taagagatcc tgtttatgtt ccgcttgcta gttttgtgac 

3501 ggcttggggt agatatacta ccattacaac cgctcaaaaa tgttctgatc gcattattta ttgtgataca 

3571 gatagcattc atctagtagg aacagaagtt ccagaagcaa tcgatcactt ggttgatcct aaaaaacttg 

3641 gttattgggg gcatgaaagc acatttcaac gagcaaaatt cattcggcag aaaacatacg tagaagaaat 

3711 tgatggcgaa ttaaatgtaa agtgtgctgg tatgccagat cgaataaaag agattgtaac ttttgacaat 

3781 tttgaagttg gtttttcaag ctatggaaag ttgctaccta aaagaacaca aggtggcgtg gtattagtag 

3851 acacaatgtt tacaatcaaa taaggaggac taataatgga actatataaa gcaatgttta tcgtacgtga 

3921 tgaaggtact attgacggtt acgatactga acactatgta gatatttctt tacatgactt tgaagaaata
3991 tatggaaaag aaacacgtga aattgaagca gtaacattag taaaaacagg aaatttaaaa aaataaatta 
4061 tttacatcct ttgcaaagta tggtaaaata ttcttgtgat agttgacaag agtcaaattt ggcgagattg 
4131 ggcgaatgta cacgtgaaat atcgtgcgct cccgttaagt tatggacaca taaacgtttt gaccgtcaac 
4201 caatcgcaaa aaccttttag gagtagccct taaatgtggc tactcttttt tgtgtttcac agaattatgt 
4271 ttcacgtgaa acagttttta tggtataata gaatcaaaag gaggtggaga ttatggaaat taaagaacat 
4341 gaatcaattt taaatggtat tcttgaaagt gtcacagacg gtgaagcaag atcaaagatt gtagaacatc 
4411 ttgaagcatt gcgagaagac tacggagcaa caactgaagc tttgacatca gcaaatagca cacttgaaaa 
4481 gttaaagaaa gataacgaag cgttggttat ttcaaactca aaattgttcc gagaacgagc gatcgtagaa 

4551 ccagcagaaa ataacgaacc agaaacagac cagaatatta cactagacga tttaggaatt taaggaggaa
4621 aaaacatggc tgacaaaatc acagaacaag atgtccttcg tgccacaaat gtagaaacac cagtacaatt 
4691 aatgactgct atttataata gttcatcatc tctttttcag gcgaacgtac ctatgccaaa tgcagataac 
4761
4831 

4901
4971 
5041 
5111 
5181 
5251 
5321 
5391 
5461 
5531 
5601 

5671
5741 
5811
5881

5951
6021 
6091 
6161 
6231 
6301 
6371 
6441 

6511
6581 

6651 

6721 

6791 

6861 

6931 

7001 

7071 

7141 

7211 

7281 

7351

7421 

7491 

7561 

7631 

7701 

7771 

7841 

7911 

7981 

8051 

8121

8191 

8261 

8331

8401 

8471 

8541 

8611

8681 

8751 

8821 

8891 

8961 

9031 

9101 

9171 

9241 

9311 

9381 

9451 

9521 

9591 

9661

9731 

9801 

9871 

9941 

10011 



atcgaagcgg 
accgtattgg 
catgccttta 
gagtctgtta 
aaggttacta 
tagtttcgtt 
ttattaatag 
atgcaaaaga 
cgctcaagga 
accattgacg 
ttattgatga 
atggtttatg 
tattggttgc 
caacaaaacc 
tatcgcattg 
ttggttaagg 
gtcaatcatt 
ctaaggagga 
cctatacaca 
taacgagaat 
aaagacgcct 
atgcctttgt 
acaaacttat 
tcgaatggaa 
atgtaacaac 
tggagataag 
cctatcaatt 
cgtttcttac 
accattcatt 
ccaacctacg 
cattcgtacc 
tgttaaggaa 
atgactttaa 
ctaataaagt 
caagatgtta 
ggaaacaaaa 
gtgcaatgag 
catcatggga 
ggtaaagtgg 
caggaaactt 
tcgttacttc 
tggaatttca 
aacaaatttt 
agatgtatag 
cagcaaaaag 
ttatcgtaga 
cctcgttatt 
tcatggtttg 
cgaagcaatg 
atgttgtata 
acataaacca 
gaaatacttc 
gatatggagt 
cagaattgaa 
tgcacgtgta 
aaatcaagaa 
tgaagtttag 
tggagggttg 
aaaagaacgt 
cgagcagaat 
catttaagtt 
tcttgaagag 
attgatacaa 
acagaaacaa 
tcaaaaagat 
gaagatttga 
cacgaagcaa 
tacgattaca 
agtgttttga 
gagggaggta 
gaacgcttta 
taattgccta 
acattttgtt 
gaaaatttaa 
ctgaaacacg 
attttggtat 



ttggtgcagg 
taaagtagtt 
ggtcgaacga 
caggggtatt 
caaacaaacg 
gctggtgtaa 
caaactacca 
atttatccgt 
gttaaaacat 
ttgacgtttt 
gtttcctaaa 
atctacgaca 
accaccacca 
tgtcacaaaa 
acatttacac 
caaccgtaaa 
agtaacattc 
caattatggc 
cacaagatgg 
agagattgtt 
tatatgcttg 
tactgatatt 
cgtttcgata 
tacctttcat 
ttttcatcct 
gaagataaat 
caagtgggga 
aacgaaagaa 
gtggatcacg 
ctagtgatcc 
taaaagaatt 
tcaaaactat 
gacctgaata 
gatgatcgag 
atcgataatg 
actccttgat 
tacaggagga 
gcaggacaac 
cagatatcga 
tcaaaactat 
tcaatgtatg 
ttaaattaaa 
tagtgcaggc 
gaaggaggaa 
cagaccttat 
caactcacgc 
tagaaattgc 
cgcaggggca 
tatcacaaga 
ataatgactt 
gatatcacga 
tcattgctac 
ttgacgaatc 
cgaagtatgg 
caaacatcag 
aagagttttg 
aacagacgcc 
ccaagtgcta 
attgaagttg 
ttgaaacaaa 
taatcttgac 
tttccgattt 
acatcaaagc 
aaatacacgt 
ttgagaattg 
gtaaagaaac 
tgcttctgaa 
cgatataaag 
gaattgagaa 
gcaacaatgg 
gcaaatatcc 
tctgaatgaa 
gagaagttag 
tcaatgatac 
tgctaacagt 
aagattcaac 



gatcacacgt 
atccgataca 
ttgaagaaat 
taaacaggaa 
atccaagaag 
tgaacgcttt 
agaaaaagag 
aagatcaaat 
ctacctcaaa 
agcagcggca 
aaagaaggcg 
aattgtacaa 
actatattct 
gttgcttttg 
cagtagaagc 
acaaacagca 
acagctatcg 
aagaaggtat 
tttaaaactc 
cttatcaaag 
taactatctc 
gaatataaga 
ttggtatacg 
taatacaatt 
aacgatggag 
caggaggatc 
ggtatacaaa 
ccttttttaa 
cgaacaaaac 
aacaggaaca 
gatcttgtag 
ttatgtatcc 
tcttacaggt 
ccgattgatt 
atcctaacga 
tgctcaagag 
gcgatctttt 
aagtaaacaa 
aaatattcca 
tatcaattgc 
gcacaaagag 
agaaccaaat 
gttacgcttt 
taagatgagt 
ccaaatgaac 
tccttacgtt 
tttacacact 
gaagatggtc 
gatatcctgt 
gaaagttcct 
gtgaatcgaa 
aagcttataa 
ttttaatgta 
aatgaagtgt 
aagtcttatc 
cgatcgtgta 
gttcgacaat 
cttaaacgtt 
gccgaaaaca 
atttatcaat 
gaatatttaa 
ttgatgacat 
gaatcgtgat 
gacacaggaa 
ccagcaatgg 
aacaagctcc 
aaagaaacaa 
gtaaaaaggg 
aatgatcttt 
tagattttaa 
tcatactgaa 
gttggtgctt 
aagagatcac 
tgtttttgca 
gtgaatattc 
gcgacaatac 



ttagacgtag 
aatcttggcg 
ttttgttgac 
gttcccgatg 
catggttaga 
atacacaggt 
ctattcaaag 
caacctctaa 
atctgatcaa 
ttcaatatga 
aagaatcgtc 
aacaacaagt
acttctcaat 
caagtgcaac 
aacaaaccaa 
ggtaaagcga 
gaggtcaaca 
acaaatgtaa 
aacaggaaca 
ggatacacaa 
atctttaaaa 
atgacaacac 
agaaagtttc 
gaagagtcgc 
tcaattttct 
aatagtaggt 
ccaaatgggg 
ataagatagt 
ggtaaggtat 
atgaaaacat 
ggaacgtgta 
ctattgttta 
ggtaaattga 
atgatgtaag 
tgtaggagtt 
caaaacattc 
cagccttagc 
ctatgtttct 
gataatgtaa 
gcttcaaaca 
caatcgagta 
attgtaggca 
ggcatacgaa 
agacgaaaag 
cctattcaag 
tcagttgttt 
aatggttatc 
aaatcgatca 
tttaagatat 
acgttaccaa 
gagcgcaaaa 
ccaaattgac 
tggcaaacaa 
taacttttct 
taacaatgaa 
aatcgtgtct 
tacaactggc 
atattgaaag 
attgtttgat 
cacttttact 
atctaaacat 
ggactacacc 
gaatcgaaga 
caaccgattc 
agatggaaca 
acaggcgttg 
agaacacaga 
aaacactgat 
agagaaatga 
ccccgacaag 
tacagatatg 
tagttaatga 
aaatgacaca 
aattatatca 
ttttgacaaa 
tgattatgga 



taaaaaacga 
taaccctttg 
attgcacagg 
taaaaacatt 
aaaagcattt 
gacgaagtaa 
agatcgaaat 
caaattagaa 
tacgttatta 
gtaaaactga 
aaatattgtg 
ctatacaacc
tcgggaacgc 
aactagtgtt 
caaggagaag 
ctgccgtaac 
agcaacggtt 
aattgttggc 
ggaatcgtac 
ctcgggggag 
acgaagaaac 
aagtttcgtt 
attgcaaaag 
ttgattacgg 
tgttattcta 
ggcccatctc 
caggcaatgc 
cgggatgtat 
aatgcaggag 
tcgctttctt 
taactacttt 
atagaaatta 
gtgtatatgt 
taactcaacc 
aaatctgact 
gcaatacttt 
aagtaacaac 
gaaaaagaaa 
cacagcttgg 
aattaaatat 
gctacaccaa 
caatgagtaa 
tgatgttttg 
gtgcaggact 
tgatgtagaa 
gaatgggaaa 
ttggtttctt 
ttatcacaac 
gatgatgatg 
gtttacatcg 
aacacctgta 
gaaaataatc 
atgctccata 
aggtatcaac 
cagattgaaa 
ttggcgatga 
ggcaggtcaa 
tttcacttat 
tttgattatc 
tgagagagat 
gccctattgg 
attgatgaga 
accaaacgaa 
tttctcaagg 
ggtgtaatca 
aaacaaacaa 
cattaataaa 
tatgctgact 
acaaggaagg 
cggtttgacg 
aattactatt 
tatgagtggt 
ctcaaaaaat 
aagaaatcaa 
aaataaaccg 
gccgatccta 



atttatttca 
aaaatgttta 
aacataagtt 
gttccacgaa 
acttcatggg 
gcgaatttga 
tggcgaaatt 
tttatgagtt 
ttgacgccga 
ctttgtagga 
gcagttattg 
ctgaagggtt 
tgttgctttt 
gttaaaggat 
ttgtttcatc 
cgtagaaggc 
cttgttacgg 
taacgtgcct 
tttaattcgt 
tttttagagt 
ttatcctagt 
acctttgaaa 
aacaccctca 
tagagaatac 
acaagtgaag 
ctttttccta 
taattttgga 
gtaacgtcgt 
gttcttataa 
ttgtgtaaaa 
agagaagctt 
cagatacaaa 
aaaaggttcg 
attattacca 
atgcttctgc 
cagacatggt 
ccttttgttg 
acggtttgaa 
atcaaactta 
gagtatgcaa 
acttacaaac 
cgatgtatta 
aattataacc 
tgctagaaat 
gaaatcagct 
atttgccaaa 
taaagaccct 
cctattttct 
atgataaatc 
ttttgcttta 
attattcaaa 
aggctgtttt 
tgtagtagat 
aatgctaacg 
gttcaggtaa 
acttgacgga 
tcaaaaaaag 
taccaacctg 
cgttttatga 
aggctcagaa 
aataaaatgt 
aacagaaatt 
gcaagtagat 
aacacttata 
attatgcaac 
cgacaaaaca 
gatcaaaatc 
tactcgaaaa 
cttatttctc 
gtttacccgc 
agatgaagaa 
tatttaaatt 
ggttgtctga 
aagattacaa 
gatgttgctg 
ttgacacgtt 



actttagttg 
aaaaaggaaa 
caaccctgac 
attaatcgtg 
ataatttcaa 
atacacgaaa 
actgaatcaa 
ccgcttacaa 
cacagacgca 
cacaaaatcg 
tagatagtga 
atattggaat 
gttaaatcag 
catctaaaga 
agcaccagca 
ttagaagtcg 
ttacttctga 
tttgataaca 
ttcctgttct 
agataaacac 
aaatggcagt 
ttgatgtttt 
actttattat 
acaacaacaa 
caatgccagt 
ttatttactt 
gagtacatgg 
atacaggtat 
gatcatgctt 
gaagcaagaa 
ttccgtttaa 
aggacatgta 
ttaggaattt 
atttaagtga 
attcatgcaa 
atgggaaaca 
gtttgactaa 
cctcttggca 
tctttcacaa 
caagacttga 
aagaaaagca 
acacgtgtga 
aagacaacgg 
aaccgttata 
actatgaaca 
atcaattgac 
acacttgggt 
ttacagcaaa 
aaaatgtatc 
gatatggcgg 
ctgatgaaaa 
tgtggataaa 
aaactacgat 
tagataagac 
catcttgtta 
aagattgacg 
accagatgag 
aattatctcg 
cgaaacaaaa 
acgatgggat 
tcctatcaaa 
gttaaatgag 
caaacagaca 
cagacacccc 
aaatatcaca 
aatcaaaata 
aaaccaaaga 
atatcgtaga
cttgtttatg 
tgtattcaaa 
gtatcggctt 
actttatcga 
tggtacgtta 
atcttggttg 
atgatcgaac 
acgtattgtt 
10081

10151 

10221 

10291 

10361 

10431 

10501 

10571 

10641 

10711 

10781 

10851

10921 

10991 

11061 

11131 

11201 

11271 

11341 

11411 

11481

11551 

11621 

11691 

11761 

11831 

11901 

11971 

12041 

12111 

12181 

12251 

12321 

12391 

12461 

12531 

12601 

12671 

12741 

12811 

12881 

12951 

13021 

13091 

13161 

13231 

13301 

13371 

13441 

13511 

13581 

13651 

13721 

13791 

13861 

13931 

14001 

14071 

14141 

14211 

14281 

14351 

14421 

14491 

14561 

14631 

14701 

14771 

14841 

14911 

14981 

15051 

15121 

15191 

15261 

15331 



gcaatcaata 
gtgtataatg 
aaagccgtta 
acgaaacaac 
attaaatgta 
ggtattcgtt 
ttaatgcctg 
taagaggtaa 
agaaattgac 
gaacaagcga 
cacaaaaaag 
attaccatag 
atccaagtgt 
attatatgca 
ataaagttta 
caaaacgtat 
atcaatcaca 
aattgcgttt 
aaagaagaaa 
cattgcatgg 
tgatgtaagt 
gaagaaaaaa 
ctaacaccat 
gtttgaaaca 
ccacaaggcg 
atttgtttgt 
atttgaaaat 
atgtggcaag 
ttcatttttg 
ttctggtaac 
aacaaccatt 
ttgatgtaga 
tgttgtttac 
acgttgtatg 
accggtttat 
aacaaaagta 
gtgctaacag 
acaaatagct 
ggagttgtcg 
gatatgggtt 
tgctaaagct 
gataatacac 
atattgatgt 
acttgatctt 
ggaaccccaa 
caggaaacgg 
aatgattgca 
ataaatgatg 
ttggcgacaa 
taaaaaagat 
tttttagggc 
atcttattta 
agaatatatc 
gcaatgatta 
ttaaagctaa 
agtaaaattc 
tatagtatac 
tagacggaac 
ggaattgatc 
accctgattg 
gcatgagagg 
attggtaaag 
ttcttgatta 
aactgatttt 
caaggctact 
gtaaaggacg 
ggatctgtat 
gatgagttta 
aattgtctga 
atcaatggtg 
taggagtgta 
caatgcttta 
tataaaaaat 
cagaacttaa 
aaaaggaaaa 
gaaaaatcta 



aagttagtgg 
gcagacatta 
atattatgac 
gaatacctca 
aatgttggta 
atgtagaggt 
ttacaattgc 
cgctagtgaa 
acagtaacat 
aaacaacagc 
tgcaactgat 
gaggaaaaat 
tcgagcagaa 
gctggtgata 
ctgaaagttt 
tggttgtttc 
cctgatggca 
tccctctaaa 



atcaaaagaa 
aaatgaaaat 
gcttatgggg 
gcgaaatggg 
tgaattgaaa 
atgacggcat 
ctcaaagtgg 
tcgtaactgt 
tgtctattct 
ggaacgatat 
tacagcgatc 
acaatcgaag 
ttctagcata 
tgtttattgt 
ggacattacc 
gcggtggcgt 
tcaaacggct 
aatacaccaa 
gtccaaatgc 
agaggacaaa 
ccaatctcca 
aggtcaatgg 
gaaacgttgg 
ctgtttcttc 
tgctacaatt 
gcacaagctt 
tcaagaatac 
cagaccaaat 
tgttgcgatg 
gtacttacaa 
agttaagaac 
tttatgactg 
aatgttttgg 
tctattgcta 
acacaatggt 
tcgattttgc 
agcaggtatc 
ggtgcagtag 
taggacatat 
actcaacaga 
tttcaaaagt 
tgaccgagca 
ggtttagaag 
ctgttcttat 
tgtttataat 
tctagtattg 
ctcaaccagc 
tttaccagga 
gtaggtaaaa 
ttttcactct 
tccaacacaa 
tggacacctg 
tagtatgaca 
ggttttaatt 
ttgtcgttaa
aaagattcct 
gaattctatt 
atgaatatcc 



ctggaatacc 
gaacacaact 
taatagcggt 
gttcaaaatg 
aactaaccaa 
gtaatatggc 
taaaaatgtt 
gctaaaacac 
caaccgcaaa 
aaacagtatc 
ctagctgttc 
aatggcaaac 
aacttgttag 
aaacaaatgc 
gacaaaccct 
gcaaaatatt 
aagtaactgt 
ataaggaggt 
ttacctattg 
caggagagtt 
ttatcgctga 
tatcactttt 
cgtgatgtac 
ttaatgtaaa 
taaaggaatt 
actttaaatg 
ctaatatctc 



caatactagg 
attatcgaca 
gtggcgtaag 
cggaaataga 
cgtaactcac 
gaaacttaaa 
taatttctat 
gacaatcgag 
tgatctataa 
aagtaatgta 
caatcgctaa 
ttgggaatcg 
acgcctaaaa 
aaggtcaagc 
tgcaggttat 
aattttatgt 
atagtaagca 
aaatcttgat 
aatttccatg 
gaacagtaac 
tatcgtttat 
ggacaagttt 
cgttaggatc 
agatggagat 
tccgatgcct 
tggcagatga 
gttaggtttt 
attgttaagg 
gtattacaat 
ttcagatatc 
aaggacgata 
tccatgcgat 
tttcaacaag 
gtacacctca 
tctcgacttt 
aaaacaggcg 
caaaaggcga 
gccacctaaa 
tacaacggca 
aacaggatca 
tacaacaggt 
ctcgatcata 
aacaatttga 
aatagcttag 
gcctaatgtt 
tcgctttatt 
caatttttca 
gtgatgataa 
cgaagttcgt 



gctacaggag 
aacaagtgaa 
acgaatgtag 
ctgtagttac 
tcgaataaca 
agataaaaat 
ctaacaggcg 
ttgcacaaca 
tcaagcgttg 
agcgcagctg 
gagtaagcag 
aaaaatattc 
atttgaccag 
aatctcttat 
gtgatcacaa 
acacaccaaa 
aaatgacaat 
tcatatggaa 
caccctgaaa 
tcaacaattt 
cggtgtaaca 
tattttcctc 
ctgtagttac 
catcgaaagt 
ttctttaatg 
aaggaacgta 
tcaagcaatt 
ggtacaggtt 
atgacgatga 
ttattatcga 
aacgctttgt 
aagtcgaggg 
gattacaggt 
tgtgacttga 
ttaactatga 
agcacctcag 
tataactagg 
aatactttca 
gctggtttga 
gcaatcttta 
agagatcatc 
actaaccctc 
gtcactggga 
tattgacggt 
cctaaaagtt 
atggtttgga 
acatgttgga 
caagaattta 
gcgcaatacg 
ttctttcata 
actggcggag 
tgaatggttg 
taatcatctt 
acaattgcca 
tggcagaaat 
gtatataaca 
gatgatgata 
ttaaatgatg 
tttgtaaata 
ctttgtcttt 
acaagaagcg 
gaagggtcaa 
ttaaagcatg 
ttatggttta 
acaaataatt 
atcttgattt 
aattgttcct 
agcacaagcg 
ttagaggaac 
tatttactta 
gcgttaaact 
gtttgtaata 
aaacacggcg 
aaacaatggc 
attaatgggt 
acaattttgt 



atatttatct 
gatggatcag 
aaggagaatt 
tgccaatcaa 
acattagaga 
attcaaatgc 
actctaatct 
agctaaagaa 
acgaaggctg 
caacggcagc 
tttagaggac 
aaatgaagga
tcgtgctgaa 
ctcggtgcag 
cgctaccaga 
tccaacagat 
gtaggtaaaa 
gaacgaattg 
cgaacccgaa 
tgttgacaca 
gattgtacac 
cttgtgaacg 
tttcttagga 
ttcaatattg 
atactcgcaa 
tgttgttgtt 
atcaaaacag 
ttagaggttt 
ttatcagaat 
ggatatgcgc 
ttgagtttca 
aatgaatagt 
aaattatatc 
tggcacaaga 
tgggtttgtt 
actgttttct 
aggatatgag 
aaatatggct 
acccgaacag 
tcgccaagca 
gctcaagggg 
agaccctttc 
acgccctggt 
agcggtggcg 
tcatgagtgg 
ctttggttca 
acaatgggag 
gttataacca 
tgacgcggat 
gatgatggaa 
ataatgacga 
gaaattttaa 
gtttatggtt 
aatttaacaa 
ggttttagtg 
atgttggttg 
ataattggac 
aatggtattg 
ttaaagcaac 
aggtaaaaag 
caattctttt 
atcagaaaga 
gttttatacg 
tgggttgctg 
ttccaattgt 
gaatgttttc 
cctgaaaata 
tgttttattt 
atacaatcat 
aaaatgtatg 
tgaagagaaa 
ggcgcacgtg
aacaatttat 
gaaagaattt 
tgggctgttc 
ttgatgagtt 



taacattaaa 
acaatttatt 
gggtacactc 
gcaaaagatt 
gtacagtggc 
aggataaaga 
tgaattagtt 
actgctgctg 
gtacagcaca 
taaaaacaca 
acagcaatac 
tagcaatgac 
ttaacaatga 
taggtatgct 
aggttttaga 
acaaaagaaa 
tcgaatatct 
atattcaaat 
acaagttgtt 
agaaaaatga 
caatattaaa 
tgattcatat 
tcgggagaaa 
atggttttgc 
ttacaatcgt 
gctagaggta 
cttttcccga 
ctttgtgaaa 
gtaattaatt 
ataacttgca 
agatgtggat 
acagctattt 
gttgtcaagg 
agcacctttg 
gttcgtggtt 
ataatcgtag 
atggcaactc 
ataataaaaa 
caatgaatat 
caaatttgtg 
ataaaacagg 
agcatttaaa 
aaacttcata 
gtggcgtaaa 
acaacttttt 
attgatcacc 
cattaagagc 
gtcaaatata 
catttacatt 
catgggaaga 
taacaataag 
taaggagaaa 
tgattatatg 
ggaaatcgac 
gtttacttta 
gtttgatttt 
tgattatgtt 
atatctctag 
aggcggaaca 
attggtgtgt 
tagataatat 
tgtaaattgg 
tatacagcaa 
aatatggatc 
tgcctgtttt 
tatggcgatg 
aaatatttga 
tgacggagaa 
gttcatggaa 
aaaagaaacc 
aacttatact 
gtataggtaa 
ttatttaaga 
cctgatcata 
cacttagtac 
tttaattgag 



ggaacggagg 
tccaatttca 
aaacaaaatg 
ctgtagctga 
taatcttgat 
tcataatcgt 
aatgctgaaa 
gtttgtcaac 
acaaaccgca 
gctgattcag 
aatatactgt 
aataatttat 
caaattgtca 
cgaaggtatg 
ccaataagaa 
tggtttatgt 
atccctagat 
gaacaagatg 
tttgatgaaa 
caactacaat 
taaattactt 
tatcgctttg 
cgacattaaa 
attatggttg 
tttgactttg 
gaggggttac 
tgtaaatggt 
aacaaccgta 
tctgtgaaat 
tgtccaaaac 
caagcttata 
cacgtttaat 
acatgttatc 
acggacggtt 
tgtctaattc 
aatcgatcat 
ttacaaatga
ttcacaagta 
ggtggaggcg 
ggttgtctaa 
tcaatggatg 
caatctgcaa 
tcgaagaaag 
acgttgctat 
ggcacgcatg 
ctggcaatga 
gtattttgtg 
aaggtaaaag 
taggttttac 
ccctttgaag 
gataaaaatg 
aaggtatgat 
gttaatggtt 
tttagtagtt 
ttcctgtagc 
atcagaaatt 
aagaagtttt 
ttatcaaaca 
ggttatgtaa 
atcattttgc 
taagggttac 
gcgaaagcat 
acctcaatac 
aaatcaacca 
cagtttacaa 
gtaatacatg 
cgccacaagt 
acgatctttg 
aagaaatccc 
agtatataaa 
ataaccctaa 
aacttatggt 
agattcaaaa 
aacttgaagt 
gtggggaatt 
aaatcaaaaa 
15401

15471 

15541 

15611 

15681

15751 

15821 

15891 

15961 

16031 

16101 

16171 

16241 

16311 

16381 

16451 

16521 

16591 

16661 

16731 

16801 

16871 

16941 

17011 

17081 

17151 

17221 

17291 

17361 

17431 

17501 

17571 

17641 

17711 

17781 



tcacttattt 
tacaagatgt 
ccagatttga 
actttgcaga 
tatcaacaat 
tgcgccattg 
gttatgatta 
gctgatgaaa 
cggtttgata 
attttagtag 
gcgatagttt 

ggtgtgttaa 

aaatgtagct 
cggctatatt 
tataaaatac 
ccttttggta 
cctgacaata 
ttcggtgata 
tctgaaaggt 
caatcatttt 
tgttcaacgc 
atttatcatg 
tgttatttct 
attaatgata 
ataagattgg 
atccatatct 
tcaataagat 
gaagtagaga 
aatttgatat 
attgagaaag 
caaattctaa 
tgaataaatt 
atcatgtatt 
gatcctttct 
tgtagtttgg 



accaaacgaa 
gttatgttga 
ataagcgttt 
agtgaagaga 
gagtttgtca 
cttttgaagg 
tcaaccaaat 
aattggcgaa 
acattgttat 
agctaccacg 
tgttttggtt 
tgtagacgaa 
ataggacgtc 
ttaatgcttt 
tgtgatatcg 
tttgtaacgc 
cttttcaaga 
tttatttccg 
tacgtttaca 
aattcctcct 
ttttcattga 
tgttaacacg 
gacttgatag 
aattgttaat 
tagcattgta 
aattccttta 
aatgtttatt 
tacctctcct 
tgataccacc 
tccagttatc 
atagaggaat 
tctgtgtata 
tacatatatg 
ttattacatc 
ggtcagttac 



gctgaagcct 
gtaatgcaac 
taatctatat 
gaaacacctc 
atgatagtga 
gaaaatcttt 
acaaatcatt 
ataattatta 
taagaattta 
attagttcta 
ctttggcgtt 
atcttttctc 
catttctttc 
tgttaaggtg 
tatattggtt 
taactgatag 
atgttaaatt 
gaacgtcgaa 
gtagaaacgt 
atttgtccgt 
tttcgttatt 
aactcttttg 
acgctaaact 
catgtaaaac 
tcgaattaat 
gttcttcaaa 
gttttcggta 
ttttcagcta 
aatcaaatgt 
atcaaatgaa 
ttactaagtt 
cgatcggttc 
tcaatcattt 
tatattatat 
atttgtgtta 



tattgaacat 
tagtgtagtg 
caagatcgag 
ttggtagatt 
tacgtttatc 
gggtattgga 
tttatgcaat 
tctttcaaca 
cattatgatt 
ttacaatgat 
agtgattttt 
atagttcttt 
tattctaacg 
agaggttcgg 
ccttgtagaa 
cgagaaccaa 
gactcgattc 
atcttgtaaa 
aaccattcaa 
aatttgttta 
gcgatattaa 
taacgtaatc 
atcgttgtca 
acccctttta 
atgttatttc 
agataacaaa 
tctatgatat 
ttaatgattt 
gattggtagc 
attgttttat 
tatcctcatc 
attcatgttt 
aattcattta 
catgtatgat 
tcaaaaaaag 



gatggaaacg 
aacccttatt 
gtatattgat 
gattcgtgga 
gaaaagagaa 
tagacgctga 
gactacgaaa 
gtggcgaaag 
tgtttaataa 
gaatagtaga 
gctaacgcct 
ctccttatac
caattcacta 
tcttgtgtat 
tgtagccatt 
cttttacgta 
gggtaatagc 
gtcccctcta 
ttagttcgcg 
tatccgtcat 
tgcaatggct 
aatgtataaa 
tctttagtta 
tattaatttg 
tgtagttttc 
caatattcct 
gataattcat 
attgttcata 
attgtattaa 
tttcaagtaa 
tctaaaaatt 
atcatccttt 
ttttaatgat 
tgtatttgtc 
ataatattct 



gttttccgaa 
tcttgtattt 
tgaattgtgt 
acagaatacg 
gtaaaaatag 
aacaggttgt 
gaccatgaag 
cattcaagaa 
gatgaaaatc 
taacatagta 
ttttgtttgc 
agttttaata 
tatccatttc 
caaaacctcc 
attccacctc 
tgaagttact 
gttgaatgag 
tgatctctat 
gtgttctttg 
gtttcaattg 
atcaagataa 
attaattgtt 
gttgatttaa 
atattgatac 
catgaatact 
catcgcctac 
atcccactca 
tgaaacactc 
attaatattc 
ctttttagcc 
ttcatacata 
ctttattaca 
ttatttgatt 
aacaattaaa 
att 



gacgtacaaa 
caatctgcag 
gattcaaaag 
aagatcttag 
tagtttctta 
gtctatgtga 
aaaatagatt 
tagttatctg 
tggtaaccct 
attgtagtct 
ttttggatcg 
attccctgta 
taggtatata 
caaccatcta 
ctttaaatag 
aatttcattg 
ttaacaaaag 
ttttccattg 
aatgttcgtg 
ttccgcatag 
acatagttat 
ttcctccttg 
accctctaaa 
caccaatcga 
cggaaataag 
ctcatcaata 
ttaaaggggt 
cttttatatt 
tggataattt 
tcatccacct 
ccacgttatt 
tatatagtat 
gtttttttat 
ttcatataaa 






Table 22 



Phage 182 ORFs list 



nb   Name        Frame   Position        Size (a.a.)
1    182ORF001    2      5966..7780      604
2    182ORF002    1      2152..3873      573
3    182ORF003    1      11305..12639    444
4    182ORF004    3      4626..5954      442
5    182ORF005    3      12651..13700    349
6    182ORF006    1      14995..16026    343
7    182ORF007    1      7795..8775      326
8    182ORF008    2      14105..14983    292
9    182ORF010    2      1310..2155      281
10   182ORF009    2      8765..9601      278
11   182ORF011    1      9607..10158     183
12   182ORF012    3      10872..11294    140
13   182ORF013    1      10456..10860    134
14   182ORF014    3      13716..14108    130
15   182ORF015    2      854..1225       123
16   182ORF018   -2      16429..16737    102
17   182ORF020    3      10158..10454    98
18   182ORF019    3      4323..4613      96
19   182ORF016   -3      16749..17033    94
20   182ORF022    1      12868..13149    93
21   182ORF023   -2      11914..12189    91
22   182ORF017    1      154..426        90
23   182ORF024    3      6174..6446      90
24   182ORF025    2      548..814        88
25   182ORF026   -3      12999..13259    86
26   182ORF027   -1      14642..14896    84
27   182ORF028    3      14430..14672    80
28   182ORF021   -3      17106..17339    77
29   182ORF030   -1      16199..16429    76
30   182ORF031   -3      8379..8603      74
31   182ORF032   -1      11195..11413    72
32   182ORF033   -1      4727..4942      71
33   182ORF034   -1      5951..6160      69
34   182ORF029   -3      17412..17606    64
35   182ORF035   -3      15570..15758    62
36   182ORF036   -3      2127..2315      62
37   182ORF037   -1      12095..12280    61
38   182ORF038    3      14769..14951    60
39   182ORF039    2      9992..10171     59
40   182ORF040   -3      16029..16202    57
41   182ORF041    1      3886..4056      56
42   182ORF042   -3      10671..10832    53
43   182ORF043   -3      10491..10652    53
44   182ORF044   -1      6299..6457      52
45   182ORF045   -2      6571..6729      52
46   182ORF046    2      2372..2527      51
47   182ORF047   -2      13201..13353    50
48   182ORF048   -3      3243..3395      50
49   182ORF049    3      1578..1724      48
50   182ORF050    2      8012..8155      47
51   182ORF051    3      9390..9530      46
52   182ORF052    1      4096..4233      45
53   182ORF053    2      15656..15793    45
54   182ORF054   -2      8002..8136      44
55   182ORF055    2      8324..8455      43
56   182ORF056    3      6549..6680      43
57   182ORF057   -3      8133..8264      43
58   182ORF058   -1      5048..5176      42
59   182ORF059   -2      15748..15876    42
60   182ORF060   -3      15276..15404    42
61   182ORF061   -3      1974..2102      42
62   182ORF062   -2      1867..1992      41
63   182ORF063   -3      14181..14306    41
64   182ORF064   -2      7234..7356      40

Key words


Tail protein;

DNA polymerase; 
Major head protein; 

Glycyl-glycine endopeptidase; Lysostaphin precursor;
Encapsidation protein; ATP/GTP-binding site motif A;
Upper collar protein;
Lysozyme; Muramidase;
Terminal protein; 

Lower collar protein;

Pre-neck appendage protein; 

Lysis protein; 
Early protein; 

Leucine-zipper motif; 
Head protein; 

Early protein; 

Early protein;
65



66 



182ORF065 



182ORF066 



3460-3582 



40 



4234..4353 
13763..13882 



39 



39 



67 
68 



69 



70 



71 



182ORF069 



182ORF070 



182ORF071 



■3 
-3 



7148..7267 
4908..5027 



39 
39 



2 
-3 



912..1031
11741..11857
11610..11723



38 



37 
37 



72 
73 



182ORF072 
182ORF073 



2763..2876
8813..8923



36 



74 
75 



76 



77 



78 



79 



182ORF074 
182ORF075 



182ORF076 



182ORF077



182ORF078 



182ORF079 



182ORF080 



-3 



2 
-2 



-2 



7353..7463
2316..2426



36 
36 



11858..11965 



35 



7564..7671 



7381..7488
4372..4473



35 
35 



33 
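
The Frame, Position and Size columns of Table 22 are arithmetically related, which gives a way to cross-check individual rows. The following is a minimal Python sketch of that check, not part of the original disclosure. It assumes 1-based inclusive coordinates, that each listed span includes its stop codon, and that forward frames are numbered from the first base of the genome. The function names are illustrative. Negative (reverse-strand) frames depend on a numbering convention that the table does not state, so they are not computed here.

```python
def orf_size_aa(start: int, end: int) -> int:
    """Residue count for an ORF given 1-based inclusive coordinates.

    Assumption: the span includes the stop codon, which is not counted.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"span {start}..{end} is not a whole number of codons")
    return length // 3 - 1


def forward_frame(start: int) -> int:
    """Forward reading frame (1, 2 or 3) implied by a 1-based start coordinate."""
    return (start - 1) % 3 + 1


# Rows taken from Table 22: (name, frame, start, end, size in a.a.)
TABLE_22_ROWS = [
    ("182ORF001", 2, 5966, 7780, 604),
    ("182ORF002", 1, 2152, 3873, 573),
    ("182ORF003", 1, 11305, 12639, 444),
]

for name, frame, start, end, size in TABLE_22_ROWS:
    # Each listed size and frame should follow from the listed position.
    assert orf_size_aa(start, end) == size
    assert forward_frame(start) == frame
    print(name, "consistent")
```

A row that fails either check most likely carries a misread digit in one of its columns.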
Table 23



Predicted amino acid sequences of ORFs from phage 182 
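
Each entry in Table 23 pairs a nucleotide line with its predicted translation, 28 codons per line. As a rough illustration only, the sketch below translates a nucleotide string with the standard codon assignments. It is not the software used to prepare the table, and the names `translate` and `reverse_complement` are hypothetical. The example input is the final nucleotide line of 182ORF008 as printed in this table.

```python
from itertools import product

# Standard codon assignments, enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Reverse complement, for ORFs listed with a negative frame."""
    return seq.lower().translate(str.maketrans("acgt", "tgca"))[::-1]


def translate(seq: str) -> str:
    """Translate codon by codon; unreadable codons become 'X', stops become '*'."""
    seq = seq.lower()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Final line of 182ORF008 (positions 14945..14983) as printed in Table 23.
print(translate("tacttaaaaatgtatgaaaagaaaccagtatataaatag"))
```

Codons containing a character other than a, c, g or t translate to `X`, which makes OCR damage in a nucleotide line easy to spot against the printed protein line.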



182ORF001 aaggtaCacaaacgtaaaattgttggctM 

, MARRYTNVKLLANVPFDNTYTHTRWFKl 

6050 caacaggaacaggaatcgcactttaattcgtttcctgttcttaacgagaacagagactgttcctatcaaagggatacacaactc 

29 QQEQESYFNSFPVLNENRDCSYQRDTQL

6134 gggggagttctcagagtagacaaacacaaagacgccctatatgcttgtaactatctcatctttaaaaacgaagaaacttatcct

57 GGVFRVDKHKDALYACNYLIFKNEETYP

6218 agtaaatggcagtatgcctttgttactgatattgaatataagaatgacaacacaagtttcgttacctttgaaattgatgtttta

85 SKWQYAFVTDIEYKNDNTSFVTFEIDVL

6302 caaacttatcgtttcgatattggtatacgagaaagtttcattgcaaaagaacaccctcaactttattattcgaatggaatacct

113 QTYRFDIGIRESFIAKEHPQLYYSNGIP

6386 ttcatcaatacaattgaagagtcgcttgattacggtagagaatacacaacaacaaatgtaacaacttttcatcctaacgatgga

141 FINTIEESLDYGREYTTTNVTTFHPNDG

6470 gtcaattttcttgttattctaacaagtgaagcaacgccagttggagataaggaagataaatcaggaggatcaatagtaggtggc

169 VNFLVILTSEAMPVGDKEDKSGGSIVGG

6554 ccatccccccccccctattatttacttcctatcaattcaagcggggaggtatacaaaccaaatggggcaggcaacgctaatttt

197 PSPFSYYLLPINSSGEVYKPNGAGNANF

6638 ggagagtacatggcgtctcttacaacgaaagaaccctttttaaacaagatagtcgggatgtatgtaacgtcgtacacaggtata

225 GEYMAFLTTKEPFLNKIVGMYVTSYTGI

6 r ccaa=aggaacaa t gaaa™ 

0 gggLcgtgCataactacCttagagaagcctCtccgCttaatgttaaggaatcaaaactatttatgtatccctattgtCtaata 

309 GHVYMYFREAFPFNVK E ?„lJ„^J a *LLt 
6974 

Vol loi^nu*^^ 



6974 gaaattacagatacaaaaggacatgtaatgactttaagacctgaatatcttacaggtggtaaattgagtgtatatgtaaaaggt

337 EITDTKGHVMTLRPEYLTGGKLSVYVKG

70S8 tcgteaggaatttctaataaagtgatgatcgagccgattgatta . »i >-=>. , a * „u 

"« aagatgkatcgataatgatLtaacga^agga^ 

393 k m lidNDPNDVGVKSDYASAFMQGNK M S 

cqc LSFTTGMF.QHYYQLRPKQlKYBYATRiiU 

7562 cgttacttctcaatgtatggcacaaagagcaatcgagtagctacaccaaac^^ 

m o YFSMYGTKSNRVATPNLQTR KAWN * 

7646 ttaaaagaaccaaatattgtaggcacaatgagtaa^ 

561 LKEPNlVGTMSNDVLTRVKQIFSAGVii, 

7730 tggcatacgaatgatgttttgaattataaccaagacaacggagatgtatag 7780 

589 WHTNDVLNYNQDNGDV*

182ORF002 atgattaagaaatatact! g C? actttgaaacaacaactgatc^ 
1 



1 MIKKYTGDFETTTDLNDCRVWSWGVCDI
2236 gacaacgttgacaatatgacgttcggtttagaaatcgattctttttttgagtggtgtaaaatgcaaggcagcacagacatttat
29 DNVDNMTFGLEIDSFFEWCKMQGSTDII
2320 ttccacaacgaaaaatttgacggagagtttacgctttcatggttattcaaaaatggtttcaaatggtgtaaagaagcaaaagaa
57 FHNEKFDGEFMLSWLFKNGFKWCKEAKE

2404 gatcgaacattccccacactcatatcaaatatgggtcaatggtatgctttggaaatttgttgggaagttaattacacaacaaca
85 DRTFSTLISNMGQWYALEICWEVNYTTT

3 8 n cagg C aaaacgaaaaaagagaaa rt ^ 

agaggaagc r gc« t aggc^^ 

2908 a^-SS^^ 

2 -acgaaggagaacac^^^ 
3076 tatatcccaaccatccaagccaagcaaagttcattactcattcaaaacgaatatcttgaatcaagtgtaaacaagccaggagtc 



2236 



169 

2740 

197 

2824 

225 






309 y I P T I Q V K Q S S L F I Q » B Y L I S S V N K L O V 
337 DELIDLTIji nvuu _________ .^^n^^GaacaccaCCqaaqqq 



337 
3244 



ARKA-CAKGHLNS IiY O H P 0 I « '„„„_.„«„„« 



3832 ggtggcgtggtattagtagacacaatgtttacaatcaaataa 3873 

561 GGVVLVDTMFTIK* 



561 

182ORF003 
11305 



RDVPVVT FLGSG ET T L * * * ...,«,,«itttaatqat:actcg< 



l:^:^ « - * H*™ rmrrvrrrr 

12061 
253 
12145 
281 
12229 



a- H4~™f~^ 

y Y K Q T I Q E A w L E K A F T s ataa _ aaacta ccaagaaaaagagct- 



1 

4710 
29 

4794 
57 

4878 
85 

4962 
5046 

5214 



225 

S382 aca* 
253 
5466 
nEFPKKEGEESSNIVAVIVDSEWFMIYD

5886 tcattagtaacattcacagctatcggaggtcaacaagcaacggttcttgtcacggctactcccgactaa 5954

421 SLVTFTAIGGQQATVLVTVTSD*

; s :„ ~ e «.LLi t L t .«.~~ 9 .t ; «.^^ 

13659 tatctattgctatccgatgccttgaatggttggaaattttaa 13700 

Su, H^fh^rn'TT'^rrr,'— -rrrrrrr 

g„L,« e Lt 9 ,««c« 9 .» 9 C« 9 t«»««^^ 

;!?„ r 9 icLiL 9 >iiJ t ««L:^ 



57 
15247 
85 

15331 

113 

15415 

141 

15499 

169 

15583 

197 

15667 



281 
15919 

309 * « ~ 

16003 tttaataagatgaaaatctggtaa 16026 

337 FNKMKIW 



J,,, astgatgtagaagaaa^ 

^7 ggg'ccLggtt^gcgcaggggcagaagacggtcaaatcgatcattatcacaaccccattttctccacagcaaacgaagcaatg 
85 GFMVCAGAEDGQIDHYHNPIFFTANEAM

8131 tatcacaagagatatcctgttttaagatatgatgatgatgatgataaatcaaaatgtatcatgttgtataataacgacttgaaa 

113 YHKRYPVLRYDDDDDKSKCIMLYNNDLK

8215 gttcctacgtcaccaagtttacatcgttttgctttagatacggcggacataaaccagatatcacgagtgaatcgaagagcgcaa

141 VPTLPSLHRFALDMADINQISRVNRRAQ

8299 aaaacacctgtaattattcaaactgatgaaaagaaatacttctcattgctacaagcttataaccaaattgacgaaaataatcag

169 KTPVIIQTDEKKYFSLLQAYNQIDENNQ

8383 gctgtttttgtggataaagatatggagtttgacgaatcttttaatgtatggcaaacaaatgctccatatgtagtagataaacta

197 AVFVDKDMEFDESFNVWQTNAPYVVDKL

8467 cgatcagaactgaacgaagtatggaatgaagtgttaacttttctaggtatcaacaatgctaacgtagataagactgcacgtgta

225 RSELNEVWNEVLTFLGINNANVDKTARV

8551 caaacatcagaagtcttatctaacaatgaacagattgaaagttgaggtaacatcttgttaaaatcaagaaaagagttttgcgat

253 QTSEVLSNNEQIESSGNILLKSRKEFCD

8635 cgtgtaaatcgtgtctttggcgatgaacttgacggaaagattgacgtgaagtttagaacagacgccgtccgacaattacaactg

281 RVNRVFGDELDGKIDVKFRTDAVRQLQL

8719 gcggcaggtcaatcaaaaaaagaccagatgagtggagggttgccaagtgctacttaa 8775

309 AAGQSKKDQMSGGLPSAT*
182ORF008 

14105 atgacgaatggtactgatatctctagctatcaaacaggaactgatctttcaaaagttccatgcgattttgtaaatattaaagca

1 M M NO ID I S S Y Q T G I D LSKVPCDFVNI K A 

14189 acaggcggaacaggttatgtaaaccctgattgtgaccgagcacttcaacaagctttgtctttaggtaaaaagattggtgtgtat 

29 TGGTGYVNPDCDRAFQQALSLGKKIGVY

14273 cattttgcgcatgagaggggttcagaaggtacacctcaacaagaagcgcaactctttttagataatactaagggttacattggt

57 HFAHERGLEGTPQQEAQFFLDNIKGYIG

14357 aaagccgctcttactcttgactttgaagggtcaaatcagaaagatgtaaattgggcgaaagcatetcttgattatgtttataac

85 KAVLILDFEGSNQKDVNWAKAFLDYVYN

14441 aaaacaggcgttaaagcatggtcttatacgtatacagcaaacctcaatacaactgatttttctagcattgcaaaaggcgattac

113 KTGVKAWFYTYTANLNTTDFSSIAKGDY

14525 ggtttatgggttgctgaatatggaccaaatcaaccacaaggctactctcaaccagcgccacctaaaacaaataatcccccaacc

141 GLWVAEYGSNQPQGYSQPAPPKTNNFPI

14609 gttgcccgttttcagtttacaagtaaaggacgtttaccaggatacaacggcaatcttgatttgaatgttttctatggcgacggt

169 VACFQFTSKGRLPGYNGNLDLNVFYGDG

14693 aacacatgggatctgtatgcaggtaaaaaacaggatcaaattgtccctcctgaaaataaaatatttgacgccacaagtgatgag

197 NTWDLYVGKKQDQIVPPENKIFDATSDE

14777 tttattttcactcttacaacaggcagcacaagcgtgtttcattttgacggagaaacgaccctcgaattgtctgatccaacacaa

225 FIFTLTTGSTSVFYFDGETIPELSDPTQ

14861 ctcgaccatattagaggaacacacaatcatgttcatggaaaagaaaccccatcaatggtgtggacacctgaacaatttgatact

253 LDHIRGTYNHVHGKEIPSMVWTPEQFDI

14945 tacttaaaaatgtatgaaaagaaaccagtatataaatag 14983

281 YLKMYEKKPVYK* 
182ORF009 

8765 gtgctacttaaacgtcatattgaaagtttcacttattaccaacctgaattatctcgaaaagaacgtattgaagttggccgaaaa 

1 VLLKRYIESFTYYQPELSRKERIEVGRK

8849 caattgtccgattttgattacccgttttatgacgaaacaaaacgagcagaacttgaaacaaaatctatcaatcacttttacttg

29 QLFDFDYPFYDETKRAEFETKFINHFYL

8933 agagagataggctcagaaacgatgggatcatttaagtttaatcttgacgaatatttaaatctaaacatgccctatcggaataaa

57 REIGSETMGSFKFNLDEYLNLNMPYWNK

9017 atgttcctaccaaatctcgaagagcttccgatttttgatgacatggactacaccattgacgagaaacagaaattgttaaacgag

85 MFLSNLEEFPIFDDMDYTIDEKQKLLNE

9101 attgatacaaacatcaaagcgaatcgtgatgaatcgaagaaccaaacgaagcaagtagatcaaacagacaacagaaacaaaaac

113 IDTNIKANRDESKNQTKQVDQTDNRNKN

9185 acacgtgacacaggaacaaccgactcttcctcaaggaacacttatacagacacccctcaaaaagatttgagaattgccagcaat

141 TRDTGTTDSFSRNTYTDTPQKDLRIASN

9269 ggagatggaacaggtgtaatcaattatgcaacaaatatcacagaagatttgagtaaagaaacaacaagccccacaggcgttgaa

169 GDGTGVINYATNITEDLSKETTSSTGVE

9353 acaaacaacgacaaaacaaatcaaaatacacgaagcaatgcttctgaaaaagaaacaaagaacacagacatcaataaagatcaa

197 TNNDKTNQNTRSNASEKETKNTDINKDQ

9437 aatcaaaccaaagatacgattacacgatataaaggtaaaaagggaaacactgattatgctgacttactcgaaaaatatcgtaga

225 NQTKDTITRYKGKKGNTDYADLLEKYRR

9521 agtgtttcgagaattgagaaaatgatctttagagaaatgaacaaggaaggcttatttctccttgtttatggagggaggtag 9601

253 SVLRIEKMIFREMNKEGLFLLVYGGR



1310 ccgaccgcaagaacatcaaagaatgatagagccaagctagagaaaatctacggtaaacctaacaaagctcgtaaaaaatacaat 

1 LTVRISKNDRAKLEKIYGKSNKARKKYN

1394 cgtttaagacaaaaaggagctgaggaaaggcaacttccaactgtcccaacatcaaagaaaagacttattgactacgtaaaatca

29 RLRQKGVEERQLPTVPTSKKRLIDYVKS

1478 acaaacatgagtcgtagcgattctaacaagacgccagacgagttggtagatttcgcacaaccttacaacgagaattacattttt

57 TNMSRSDFNKMLDELVDFAQPYNENYIF

1562 gagatcaacaagcgaaacgttgcaatctcaagagcgcaaaccaaagaagcgcaaattaaaacagagcaagctcaaaaagcgaaa

85 EINKRNVAISRAQIKEAQIKTEQAQKAK

1646 gaagaacactacaaagagcttaacaaagttgaagttaagaagcccacagaaaacacaattgtcacaccaaccattttaacagag

113 EEHYKELNKVEVKKPTENTIVTPTILTE

1730 tcaggtgctgacttaccttctcaagcaacaccagattctaacattgacgctctcacccccccagaaggagttcagtcctactta 
225
2066 
253 

2150 ttatga 2155 

281 L * 

182ORF011 
9607 



10111 gctacaggagatatttatcttaacattaaaggaacggagggtgtataa 
169 ATGDIVLNIKGTEGV*

182ORF012 gacaacaatttatatccaagtgctcgagcagaaaactcgtcagatttg 



10872 acggcaaataaaaatattcaaatgaaggatagcaacgacaataatc^ , ^ 

57 VGMLEGM I K r i _ .,„, aa n aa ^ oa t ttatat 



57 
11124 



11292 taa 11294 

141 

182ORF013



AAG liS i a " * - „ a „ a r aart-aattcaacacaaaaaaqt 



10540 
29 

10624 



s™, U4 e « TTTT --~iTTrrrT*rr^ 10860 

14052 gattacgttaagaagtctttagacggaacactcaacagaaaggacgatattaaatga 14108

113 DYVKKFLDGTLNRKDDIK

1190 caaaaatttgtaaatctacaagtggtttcactgtag 1225 

113 QKFVNLQVVSL* 
TTOGGKOLILYIDYVTKEFVLTHDKYN
VYLDSHCIMIAITKSMKSVEHXAt.y 



E I T Q G G KQ L I L ^L^l^rJ^LU 
16865 tat 

57 Y , - - - 

16781 catgacggatataaacaaattacggacaaatag 1674 9 

406 ttacaaaaactagtaaaataa 426 

85 L Q K L V K 



W aaaa t a g a 3 aJa t ^^ 

2 Js fi9 Ctattac^aat^ 

16485 tctcgcLtLgtcagcgttacaaataccaaaagg^ 16429 

lir"' a jaal ^aagaaca^ 

4407 catL^tgcgagaic^ 

2n gaCaacg^agcg^ 

57 DNEALVlSNSKLFRERAl vc 

4575 acagaccagaatattacactagacgatttaggaatttaa 4613 

1«410 „ t , C agtggcc.»t=etg.t59t«tecjtt«ege.g.g9t9t«« 1°<« 

13120 atattgacggtagcggtggcggtggcgtaa 13149 

11937 tgcttgagagatattagagaatag 11914 

6426 aatacacaacaacaaatgtaa 6446 

85 M T Q Q Q M * 
182ORF025

548 atgggtcgaaaactaatgcaacgaaacgtaacatcaactaaagtagaattctcagaagttatcgtacaagatggagcgccaaca

1 MGRKLMQRNVTSTKVEFSEVIVQDGAPT

632 attgtaccatgcgaaccagttgtcttaacaggaaaactttcagaagaaaaagctttatcagcgatcaaacgtaaaaaccctgat

29 IVPCEPVVLTGKLSEEKALSAIKRKNPD

716 aaaaacgtagttgtaacaaatgtttcacatgaaacagcgctttacacaatgccagtcgataaatttatcgagttagcagacaaa

57 KNVVVTNVSHETALYTMPVDKFIELADK

800 tcaacacaagcctaa 814

85 STQA*

13259 atggaaattatttggtctgccgtttcctgcatgcgtgccaaaaagttgtccactcatgaaacttttaggatcaagatttgtatc 

1 MEIIWSAVSCMRAKKLSTHETFRIKICI

13175 cttgattggggttccatagcaacgttttacgccaccgccaccgctaccgtcaatatgcttactataagcttgtgcaagatcaag

29 LDWGSIATFYATATATVNMLTISLCKIK

13091 tctttcttcgatatgaagtttaccagggcgttcccagtgacacataaaattaattgtagcaacatcaatatttgcagattgttt

57 SFFDMKFTRAFPVTHKINCSNINICRLF

13007 aaatgctga 12999

85 KC*
182ORF027

14896 atgaacatgattgtatgttcctctaatatgatcgagttgtgttggatcagacaattcaaagatcgtttccccgtcaaaacaaaa

1 MNMIVCSSNMIELCWIRQFKDRFSVKIK

14812 cacgcttgcgctacctgttgcaagagtgaaaataaacccatcacttgtggcgccaaatattttattttcaggaggaacaatctg

29 HACATCCKSENKLITCGVKYFIFRRNNL

14728 accctgttctttacctacatacagatcccatgtattaccatcgccatagaaaacattcaaatcaagattgccgttgtatcccgg

57 ILFFTYIQIPCITIAIENIQIKIAVVSW

14644 taa 14642

85 *

182ORF028

14430 atgtttataataaaacaggcgttaaagcatggttttatacgtatacagcaaacctcaatacaactgatttttctagtattgcaa 

1 MFIIKQALKHGFIRIQQTSIQLIFLVLQ

aaggcgattacggtttatgggttgctgaacatggatcaaatcaaccacaaggctactctcaaccagcgccacctaaaacaaata

KAIMVYGLLNMDQINHKATLNQRHLKQI

attttccaattgttgcctgttttcagtttacaagtaaaggacgtttaccaggatacaacggcaatcttgatttga 14672 

IFQLLPVFSLQVKDVYQDTTAILI 

tlZT^ atgaacgaaccgatcgtatacacagaaatttattcaaataacgtggtacgtatgaaaatttttagagatgaggat 

1 M M EPIVYTEIYSNM VVCMKIFRDEDKLS 

17522 aaattcctctatttagaatttgaggtggatgaggctaaaaagttacttgaaaataaaacaatttcatttgatgataactggact

29 KFLYLEFEVDEAKKLLENKTISFDDNWT

17438 ttctcaataaattatccagaatattaa 17412 

57 FSINYPEY* 

182ORF030
16429 atggctacattctacaaggaaccaatatacgatatcacagtattttatatagatggttgggaggttttgatacacaaaaccg^

1 MATFYKEPIYDITVFYIDGWEVLIHKTE

16345 cctctcaccttaacaaaagcattaaaatatagccgtatatacctagaaatggatatagtgaattgcgttagaatagaaagaaat

29 PLTLTKALKYSRIYLEMDIVNCVRIERN

16261 ggacgtcctatagctacattttacagggaattattaaaactgtataaggagaaagaactatga 16199 

57 GRPIATFYRELLKLYKEKEL* 



14514 
29 

14598 
57, 
182ORF029



atgttacctgaactttcaatctgttcattgttagataagacttctgatgtttgtacacgtgcagtcttatctacgttagcattg 
MLPELSICSLLDKTSDVCTRAVLSTLAL
ttgatacctagaaaagttaacacttcattccatacttcgttcaattctgatcgtagtttatctactacatatggagcatttg^
LIPRKVNTSFHTSFNSDRSLSTTYGAFV



182ORF031 
8603 
1 

8519 
29 

8435 tgccatacattaaaagattcgtcaaactccatatctttatccacaaaaacagcctga 8379 

57 CHTLKDSSNSISLSTKTA*

III™" 2 atgtttcatcaaaaacaacttgtttcgggttcgtt^^^^ 

1 M PHQKQLVSOSFQGAIGHSPDPLLSSC 5 

11329 tttgaatatcaattcgttcttccatatgaacctccttattttagagggaaaacgcaattatctagggatagatattcgatttta

29 FEYQFVLPYEPPYFRGKTQLSRDRYSIL

11245 cctacattgtcatttacagttactttgccatcaggtgtgattgatacataa 11195 

57 PTLSFTVTLPSGVIDT* 

"11**°" atgtcaacaaaaatttcttcaatcgttcgacctaaaggcatgtttccttttttaaacattttcaaagggttacg 

! M STKISSIVRPKGMFPFLNIFKGLRQU *- 

4858 tatcggataactactttaccaatacggtcaactaaagttgaaataaattcgttttttactacgtctaaacgtgtga^ 

29 YR ITTLPI RSTKVEINS FFT TSKRVI PA 

4774 ccaaccgcttcgatgttatctgcatttggcataggtacgttcgcctga 4727 

57 PTASMLSAF GIGTFA* 

182ORF034

6160 gtgtttatctactctaaaaactcccccgagtcgtgta^

1 v fIYSKNSPELCIPLIRTISILVKNRKR 

6076 attaaagtacgattcctgttcctgttgagttttaa^^ 

29 IKV RFLFLLSFKPSCVCIGVIKRHVSWW 

5992 ttccacatttgtataccttcttgccataattgtcctccttag 5951 
57 FYICIPSCHNCPP
182ORF035
15758



atggcgcataagaaactactatttttacttctcttttcaataaacgtatcactatcatcgacaaactcattgttgatactaaaa 
i M A H K K L L F L LLFS I N V S L S LTNS LLI L K 

15674 tcttcgtattctgttccacgaatcaatctaccaaaaggtg^ 

tl sS YSVPR I NLPKGVS LFTS A K S FESHMS 



29 

15590 atcaatatacctcgatcttga 15570 
57 INIPRS*

182ORF036
2315 



atgtctgtgctgccttgcattttacaccactcaaaaaaagaatcgatttctaaaccgaacgtcatattgtcaacgttgtctata 

V" M S V L PC I LHHSKKES I SKPNVILSTLS I 

2231 tcgcatacgccccacgaccatacacgacaatcgttgagatcagttgttgtttcaaagtcgccagtatatttcttaatcataatt 
29 s hTPHDHTRQSLRSVVVS KSPVYFLIII 



2, - 

2147 cttctcctgtttctgaattaa 2127 

57 LLLFLN* 



182ORF037

12280 gtgagttacgacaataaacatctacatcaatataagcttgatcca^

i^.u ^VySnkhlhqykldphletqtkrfyfrm 

ctagaaaatggttgttgttttggacatgcaagttatgcgcatatcctcgataataacttacgccaccttcgattgtgttaccag 
LENGCCFGHAS YAHI LDNNLRHLRLCYQ 



12196 
2 9 

12112 aaatttcacagaaattaa 12095 

57 KFHRN*



\r rrTTTTTTTTTTTn'TT-rrrrrr rrrrrr 

14937 ttgatatttacttaa 14951
57 LIFT* 

182ORF039

9992 atgttgctgatgatcgaacattttggtataagattcaacgcgacaatactgattatggagccgatccta^

1 M L L MIE HFGIRFNATILIMEPIL LT J 

10076 ttgttgcaatcaataaagttagtggctggaataccgctacaggagatatttatcttaacattaaaggaa^ 

29 l l qSI KLVAGIPLQEIFILTLKE RRVYN 

10160 ggcagacattag 10171 

57 G R H * 

182ORF040

16202 atgagaaaagatttcgtctacattaacacacccgatccaaaagcaaacaaaaaggcgttagcaaaaatcact^^

i M R KDFVYINTPDPKANKKALAKITNAK.c 

16118 ccaaaacaaaactatcgcagactacaattactatgttatctactattcaccattgtaatagaactaatcgtgg^ 

29 PKQNYRRLQLLCYLLFIIVIELIVVALL 

16034 aaatag 16029 

57 K * 

182ORF041

3886 atggaactatataaagcaatgtttatcgtacgtgatgaaggtactattgacggttacgatactgaacactatgtag^
1 M ELYKAMFIVRDEGTIDGYDTEHYVDIS 

ttacatgactttgaagaaatatatggaaaagaaacacgtgaaattgaagcagtaacattagtaaaaacaggaaatttaaaaa^ 
L HDFEEIYGKETREIEAVTLVKTGNLKK 



3 970 
29 

4054 taa 4056 

57 



2 , sl l, T»S«LSSPVBTPL»IVTOIKll' 

^ ^aaaagc^ 

6373 cgaacaataaagttgagggtgttctttcgcaatgaaactctctcgtataccaatatcgaaacgataagttcgtaa 6299 

2 , RIIK LRVFFCNETFSYTNIETISL 

6 "r 04 ^gaa C ggca t ac= C gta t acgacg t ta=ata= r ^^^^^ 

1 MNGIPVYDVTYIPTILFKKGSFV v * 



6645 tactctccaaaattagcactgcctgccccacttggcttgtatacctccccacttgaattgacaggaagtaaataa 6571

2 , yspKLALPAPFGLYTSPLELIGSK 

2nr° 4S atggtttcaaatggtgtaaagaagcaaaagaagatcgaacattctccacactcatatcaaatatgggtcaatggtatg 
1 MVSNGV KKQKKIEHSPHSYQIWV.N O H 

2< S6 aaatttgttgggaagctaactacacaacaacaaaatcaggtaaaacgaaaaaagagaaatctcgaacaataa 
KFV GKLITQQQMQVKRKKRMI.EQ 
ill™ atgctcccattgttccaacatgtgttactgttccatcgcaacatgcaatcatttcattgccagggtgatcaattgaaccaaage

1 M L P L FQHVLLFHRNMQ S FHCQGD Q L M Q S 

13 269 ccaaaccatcatggaaactatttggtctgccgtttcctgcatgcgtgccaaaaagttgtccactcatga 13201 

2 9 pMHHGNYLVCRFLHACQKVVHS* 

182ORF048

3395 atgtcagggtttgttccgaactttccatacaagctattcaacatacccttggcgttagcttttctagccccttcggtggtgttc

1 M SG FVPNFPYKLFNIPLALAFLA P S V V F 

3311 tttacttcgatccatttatcgacccagcccttgaacatatcacaagaagctttgaacatatatccgtaa 3243 

„ FTSIHLSIQP LNISQEALNIYP* 

" 2 78 RF ° 49 atgttgcaatctcaagagcgcaaatcaaagaagcgcaaattaaaacagagcaagctcaaaaagcgaaagaagaacactacaaag 
! MLQSQERKSKKRKLKQSKLKKR K K N T T n. 

1662 agcttaacaaagttgaagttaagaagcccacagaaaacacaattgtcacaccaactattttaa 1724

29 SLTKLKLRSPQKTQLSHQLF*

8096 acaaccctattttctttacagcaaacgaagcaatgtatcacaagagatatcctgttttaa 8155

29 ttLFSLQQTKQCI TRDILP * 

1 MLLKKKQRTQT L I K I K I K. P J ' 

9474 aaaagggaaacactgattatgctgacttactcgaaaaatatcgtagaagtgttttga 9530

29 KRETLIMLTYSKNIVEVF*

\IIT F0S2 gtgatagttgacaagagtcaaatttggcgagat^^ 

I V XVDKSQIWRDWANVHVKYRALPLSYGH 

4180 ataaacgttttgaccgtcaaccaatcgcaaaaaccttttaggagtagcccttaa 4233 

29 INVLTVNQSQ KPFRSSP* 

ssr* fTT * rrr rrr^^ 

15740 gtagtttcttatgcgccattgctttcgaagggaaaatcttcgggtattggatag 15793 
29 VVSYAPLLLK GKSLGIG* 



8052 gaacccaagtgtagggtctttaaagaaaccaagataaccattagtgtgtaa 8002 

29 EPKCRVFKE TKITISV* 



8408 agtttgacgaatcttttaatgtatggcaaacaaatgctccatatgtag 8455 

29 SLTNLLMYGKQ MLHM* 

6633 attttggagagtacatggcgtttcttacaacgaaagaaccttttttaa 6680 

29 I LESTWRFLQRKNLF* 

8180 gatttatcatcatcatcatcatatcttaaaacaggatatctcttgtga 8133

5092 gcaaatgctttttctaaccatgcttcttggatcgtttgtttgtag 5048 

29 VNAPSNHASWIVCL* 

1 MVFRSH.CIKMICIWLIIITHIU1A 

15792 tatccaatacccaaagattttcccttcaaaagcaatggcgcataa 15748 

29 YPI PKDFPFKSNGA* 

S^ 0, \t9.«ttt 9 .«tctc.^^ 

1 VlFDFSlKNSSNKIVRiov* 

15320 gtactaagtggaacagcccaacccattaatttatcatcacaatag 15276 

29 VLSGTA QPINLS SQ* 

HIT' 61 atgaggggacttctccacctgtttcagactcgatcacttttgcaatctcactgtaaacttgttcttttttctgttgtacttctg 
M ROLLHLFQTRSLLQ SVC KLVLFSVVLL

2018 ctecgtcataaatgtagtcaaggttcatgtctaagaagttactaa 1974 

I MSKKLLTYVFINKborxc 

1908 gaattgaaaatagtaaacatcgcttgtctgaaattgtcgtaa 1867 

14222 cacaatcagggtttacataacctgttccgcctgttgctttaa 14181

! MMLVKPTKGLLL AKABRl A 

7272 ataccatgtctgaaagtattgcgaatgttttgctctcga 7234 

3498 cacaaaactagcaagcggaacataaacaggacctcttaa 3460 

4318 agattatggaaattaaagaacatgaatcaattttaa 4353 

13798 atcattgcaaccattaaccatataatcaaaccataa 13763

29 I IATI NHI I K P * 

7183 tttaactcctacatcgttaggatcattatcgattaa 7148 

4943 Ltgtcaacaaaaatttcttcaatcgttcgacctaa 4908 

947 tacgtaacggaattgaagccccgtttgtggcattga 912 

11825 actttgatttgtttgttcgtaactgtactttaa 11857 

11639 ttcaattcaatggtgttagcaaagcgataa 11610 
FNSMVLAKR* 



2792 ccatgtgtagcttttagccaatctttgtaa 2763 

iir m ~^4 u rrrr?? t rrrr?rrrrrn ,, ?Tr 

8839 aacttcaatacgttcttttcgagataa 8813 

7379 tgtttacttgttgtcctgctcccatga 7353 
1 msVENVRSSFASLHHLKPFLNN HESINS

2342 ccgtcaaatttttcgttgtggaaataa 2316 

29 PSNFSLWK* 
182ORF077 

11858 atgaaggaacgtatgttgttgttgctagaggtagaggggttacatttgaaaattgtctattctctaatatctctcaagcaatta 
1 M K E RM LLLLE VEG LHLK IVYS LI S LKQL 

11942 tcaaaacagcttttcccgatgtaa 11965 

29 SKQLFPM* 
182ORF078 

7671 gtgcctacaatatttggttcttttaatttaatgaaattccatgcttttcttgtttgtaagtttggtgtagctactcgattgctc 
1 VPTIFGSFNLMKFHAFLVCKFGVATRLL 
7587 tctgtgccatacattgagaagtaa 7564 

29 FVPY IEK* 

182ORF079 

7488 gtgaaagataagtttgatccaagctgtgttacattatctggaatattttcgatatctgccactttacctgccaagaggttcaaa 
1 vKDKFDPSCVTLSGIFSISATLPAKRFK 
7404 ccgttttctttttcagaaacatag 7381 

29 PFSFSET* 
182ORF080 

4473 gtgtgctatttgctgatgtcaaagcttcagttgttgctccgtagtcttctcgcaatgcttcaagatgttctacaatctttgatc 
1 vCYLL MSKLQLLLRSLLAMLQDVLQSLI 

4389 ttgcttcaccgtctgtga 4372 

29 L L H R L * 
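
The listing above pairs each nucleotide line with its translation, 28 codons (84 nucleotides) per full line. A minimal sketch of that translation step follows, assuming the standard genetic code and sequences written 5' to 3' on the coding strand, as the listing appears to print them. The names `CODON_TABLE` and `translate` are illustrative and do not come from the source.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order
# (ttt, ttc, tta, ttg, tct, ... ggg). "*" marks a stop codon.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string codon by codon.

    Unreadable codons (for example OCR damage such as '^') become 'X';
    a trailing partial codon is ignored.
    """
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Final segment of 182ORF077 as printed in the listing.
print(translate("tcaaaacagcttttcccgatgtaa"))  # SKQLFPM*
```

The sketch applies the table uniformly, so an ORF that begins with gtg is reported with V as its first residue, which is also how the listing prints such ORFs.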
Table 24

Sequence similarities between phage 182 and public databases

Phage: 182 
Database: nr 

Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters)

gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) >... 384 e-105
gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) >... 374 e-103
gi|1429238|gnl|PID|e1173412 (X99260) tail protein [Bacteriophag... 346 3e-94
gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] >gi|2... 208 8e-53
gi|1181970|gnl|PID|e221269 (Z47794) tail protein [Bacteriophage... 62 8e-09
gi|1181968|gnl|PID|e221267 (Z47794) tail protein [Bacteriophage... 56 6e-07
gi|2500030|sp|Q59968|CARA_SULSO CARBAMOYL-PHOSPHATE SYNTHASE SM... 49 8e-05

Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1
(573 letters)

gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0... 665 0.0
gi|1429230|gnl|PID|e1173404 (X99260) DNA polymerase [Bacterioph... 657 0.0
gi|118849|sp|P03680|DPOL_BPPH2 DNA POLYMERASE (EARLY PROTEIN GP... 654 0.0
gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP... 654 0.0
gi|15732 (X53371) DNA polymerase (AA 1-575) [Bacteriophage phi-29] 651 0.0
gi|15734 (X53370) DNA polymerase (AA 1-575) [Bacteriophage phi-29] 651 0.0
gi|1572479|gnl|PID|e242301 (X96987) DNA polymerase [Bacteriopha... 565 e-160
gi|1072656|pir||S51275 DNA polymerase - phage CP-1 >gi|836593|g... 301 1e-80
gi|118847|sp|P22374|DPOM_ASCIM PROBABLE DNA POLYMERASE >gi... 71 3e-11
gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE >gi... 65 1e-09
gi|461963|sp|P33538|DPOM_NEUIN PROBABLE DNA POLYMERASE >gi... 62 1e-08
gi|1084487|pir||S41618 DNA polymerase - slime mold (Physarum po... 61 3e-08
gi|2435429 (AF012250) unassigned reading frame (possible DNA po... 61 3e-08
gi|578157|gnl|PID|e246743 (X52106) DNA polymerase [Neurospora i... 59 1e-07
gi|2147969|pir||S72369 probable DNA-polymerase - Gelasinospora ... 58 2e-07
gi|2147968|pir||S62752 probable DNA-polymerase - Gelasinospora ... 58 2e-07
gi|3511140 (AF061244) B type DNA polymerase [Agrocybe aegerita] 57 3e-07
gi|118850|sp|P10479|DPOL_BPPRD DNA POLYMERASE (PROTEIN P1) >gi|... 56 6e-07
gi|578144 (X63909) putative DNA-polymerase, B-type [Morchella c... 47 3e-04
gi|232013|sp|P30322|DPOM_AGABT PROBABLE DNA POLYMERASE >gi|3208... 46 6e-04

Query= sid|110159|lan|182ORF004 Phage 182 ORF|4626-5954|3
(442 letters)

gi|138117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN ... 309 2e-83
gi|138118|sp|P07531|VG8_BPPZA MAJOR HEAD PROTEIN (LATE PROTEIN ... 305 3e-82
gi|1429236|gnl|PID|e1173410 (X99260) major head protein [Bacter... 300 1e-80
gi|1181958|gnl|PID|e221257 (Z47794) major head protein [Bacteri... 152 6e-36

Query= sid|110160|lan|182ORF005 Phage 182 ORF|12651-13700|3
(349 letters)

gi|137932|sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE PR... 52 8e-06
gi|1429242|gnl|PID|e1173416 (X99260) morphogenesis protein [Bac... 48 7e-05
gi|137933|sp|P07538|VG13_BPPZA MORPHOGENESIS PROTEIN 1 (LATE PR... 47 2e-04

Query= sid|110161|lan|182ORF006 Phage 182 ORF|14995-16026|1
(343 letters)

gi|137944|sp|P11014|VG16_BPPH2 ENCAPSIDATION PROTEIN (LATE PROT... 402 e-111
gi|137945|sp|P07541|VG16_BPPZA ENCAPSIDATION PROTEIN (LATE PROT... 402 e-111
gi|1429245|gnl|PID|e1173419 (X99260) encapsidation protein [Bac... 381 e-105
gi|1181972|gnl|PID|e221271 (Z47794) encapsidation protein [Bact... 159 2e-^a


Query= sid|110162|lan|182ORF007 Phage 182 ORF|7795-8775|1
(326 letters)

gi|1429239|gnl|PID|e1173413 (X99260) upper collar protein [Bact... 271 5e-72
gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ... 256 1e-67
gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ... 256 2e-67
gi|1181960|gnl|PID|e221259 (Z47794) connector protein [Bacterio... 148 6e-35

Query= sid|110163|lan|182ORF008 Phage 182 ORF|14105-14983|2
(292 letters)

gi|4210750|gnl|PID|e1374037 (AJ132604) LysL protein [Lactococcu... 139 2e-32
gi|462559|sp|P34020|LYC_CLOAB AUTOLYTIC LYSOZYME (1,4-BETA-N-AC... 75 8e-13
gi|2327014 (U82823) putative lysozyme [Saccharopolyspora erythr... 64 2e-09
gi|126652|sp|P25310|LYCM_STRGL LYSOZYME M1 PRECURSOR (1,4-BETA-... 60 2e-08
gi|127789|sp|P19386|LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE... 60 2e-08
gi|67761|pir||MUBPCP N-acetylmuramoyl-L-alanine amidase (EC 3.5... 59 3e-08
gi|4105636 (AF049087) lys [Leuconostoc oenos bacteriophage 10MC] 59 3e-08
gi|623084 (L02496) muramidase; muramidase [Bacteriophage LL-H] 57 1e-07
gi|127787|sp|P15057|LYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE... 57 2e-07
gi|126597|sp|P00721|LYCH_CHASP N,O-DIACETYLMURAMIDASE (LYSOZYME... 57 2e-07
gi|127788|sp|P19385|LYCA_BPCP7 LYSOZYME (ENDOLYSIN) (MURAMIDASE... 57 2e-07
gi|67762|pir||MUBPC7 N-acetylmuramoyl-L-alanine amidase (EC 3.5... 56 3e-07
gi|3025168|sp|P76421|YEGX_ECOLI HYPOTHETICAL 32.0 KD PROTEIN IN... 53 2e-06
gi|4204413 (AF047001) Lys44 [Oenococcus oeni temperate bacterio... 53 3e-06
gi|2116978|gnl|PID|d1020940 (D88151) cortical fragment-lytic en... 52 5e-06
gi|2392844 (AF011378) lysin [Bacteriophage sk1] 48 8e-05

Query= sid|110164|lan|182ORF009 Phage 182 ORF|8765-9601|2
(278 letters)

gi|1429240|gnl|PID|e1173414 (X99260) lower collar protein [Bact... 180 1e-44
gi|137921|sp|P04333|VG11_BPPH2 LOWER COLLAR PROTEIN (LATE PROTE... 171 5e-42
gi|215341 (M12456) p11 lower collar protein [Bacteriophage phi-29] 98 9e-20
gi|224162|prf||1011232B protein p11, lower collar [Bacteriophage... 97 1e-19
gi|535260 (Z30339) STARP antigen [Plasmodium reichenowi] 50 1e-05
gi|4049753 (AF063866) ORF MSV230 hypothetical protein [Melanopl... 49 4e-05
gi|2131557|pir||S70306 hypothetical protein YEL077C - yeast (Sa... 48 5e-05
gi|131782|sp|P12753|RA50_YEAST DNA REPAIR PROTEIN RAD50 (153 KD... 48 7e-05
gi|2131309|pir||S70305 hypothetical protein YBL113C - yeast (Sa... 47 2e-04
gi|499325 (Z26314) STARP antigen [Plasmodium falciparum] 46 3e-04
gi|3845171 (AE001391) ribosome releasing factor (OO, TP) [Plasm... 46 3e-04
gi|731903|sp|P40434|YIR7_YEAST HYPOTHETICAL 197.5 KD PROTEIN IN... 45 5e-04
gi|1632829|gnl|PID|e276379 (Y08924) AARP2 protein [Plasmodium f... 45 5e-04
gi|1176490|sp|P40889|YJW5_YEAST HYPOTHETICAL 197.6 KD PROTEIN I... 45 5e-04
gi|1077300|pir||S51848 hypothetical protein HRD1054 - yeast (Sa... 45 5e-04
gi|2425143 (AF020407) WimA [Dictyostelium discoideum] 45 6e-04
gi|1181961|gnl|PID|e221260 (Z47794) collar protein [Bacteriopha... 45 6e-04
gi|2132657|pir||S64819 probable membrane protein YLL067c - yeas... 45 8e-04
gi|2133041|pir||S65341 probable membrane protein YPR204w - yeas... 45 8e-04
gi|730275|sp|P39793|PBPA_BACSU PENICILLIN-BINDING PROTEINS 1A/1... 45 8e-04

Query= sid|110165|lan|182ORF010 Phage 182 ORF|1310-2155|2
(281 letters)

gi|135604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN >gi|75815|pi... 69 3e-11
gi|1572478|gnl|PID|e242334 (X96987) terminal protein [Bacteriop... 65 3e-10
gi|1429231|gnl|PID|e1173405 (X99260) terminal protein [Bacterio... 64 1e-09

Query= sid|110166|lan|182ORF011 Phage 182 ORF|9607-10158|1
(183 letters)

gi|137928|sp|P07537|VG12_BPPZA PRE-NECK APPENDAGE PROTEIN (LATE... 51 6e-06
gi|1429241|gnl|PID|e1173415 (X99260) pre-neck appendage protein... 51 6e-06
gi|137927|sp|P20345|VG12_BPPH2 PRE-NECK APPENDAGE PROTEIN (LATE... 50 1e-05



Query= sid|110169|lan|182ORF014 Phage 182 ORF|13716-14108|3
(130 letters)

gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14... 97 6e-20
gi|137938|sp|P07539|VG14_BPPZA LYSIS PROTEIN (LATE PROTEIN GP14... 96 8e-20
gi|1429243|gnl|PID|e1173417 (X99260) lysis protein [Bacteriopha... 96 8e-20
gi|215332 (M14782) lysis protein [Bacteriophage phi-29] 94 5e-19

Query= sid|110170|lan|182ORF015 Phage 182 ORF|854-1225|2
(123 letters)

gi|15670 (V01155) reading frame 10 (maybe gene 4) [Bacteriopha... 70 5e-12
gi|138072|sp|P06953|VG5A_BPPZA EARLY PROTEIN GP5A >gi|75836|pir... 69 7e-12

Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3
(96 letters)

gi|1429235|gnl|PID|e1173409 (X99260) head morphogenesis protein... 61 2e-09
gi|138111|sp|P13848|VG7_BPPH2 HEAD MORPHOGENESIS PROTEIN (LATE ... 57 3e-08
gi|138112|sp|P07533|VG7_BPPZA HEAD MORPHOGENESIS PROTEIN (LATE ... 54 1e-07

Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters)

gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6 >gi|75841|pir||... 55 7e-08
gi|138098|sp|P03685|VG6_BPPH2 EARLY PROTEIN GP6 >gi|75840|pir||... 54 2e-07
gi|1429234|gnl|PID|e1173408 (X99260) gene 6 product [Bacterioph... 54 2e-07
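
Each query header in the table carries the ORF name, its genome coordinates, and a final digit, followed by the length of the translated query in letters. The sketch below parses one header and checks two relationships that hold for the entries shown: the letter count equals the coordinate span divided by three, minus one for the stop codon, and the final digit equals `(start - 1) % 3 + 1`. Reading that digit as a reading frame is an assumption drawn from this pattern, and the names `HEADER` and `parse_query` are illustrative only.

```python
import re

# Pattern for a normalized header such as
# "Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2"
HEADER = re.compile(
    r"Query=\s*sid\|(?P<sid>\d+)\|lan\|(?P<orf>\S+)\s+Phage\s+(?P<phage>\S+)"
    r"\s+ORF\|(?P<start>\d+)-(?P<end>\d+)\|(?P<frame>\d)"
)


def parse_query(line: str) -> dict:
    """Extract ORF name, coordinates, and derived values from a query header."""
    match = HEADER.search(line)
    if match is None:
        raise ValueError(f"not a recognizable query header: {line!r}")
    start, end = int(match["start"]), int(match["end"])
    span = end - start + 1  # nucleotides, stop codon included
    return {
        "orf": match["orf"],
        "start": start,
        "end": end,
        "frame": int(match["frame"]),             # last field (assumed: frame)
        "frame_from_start": (start - 1) % 3 + 1,  # frame implied by the start
        "residues": span // 3 - 1,                # codons minus the stop codon
    }


info = parse_query("Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2")
print(info["orf"], info["residues"], info["frame"], info["frame_from_start"])
# 182ORF001 604 2 2
```

A mismatch between `residues` and the printed letter count, or between the two frame values, is a quick way to flag a header whose coordinates were damaged in transcription.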
Table 25

Homologies between 182 ORFs and proteins in public databases 



Phage: 182 
Database: Swissprot 

Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters)

gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) 384 e-106
gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) 374 e-103
gi|2500030|sp|Q59968|CARA_SULSO CARBAMOYL-PHOSPHATE SYNTHASE SM... 49 2e-05

Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1
(573 letters)

gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE 665 0.0
gi|118849|sp|P03680|DPOL_BPPH2 DNA POLYMERASE (EARLY PROTEIN GP2) 654 0.0
gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP2) 654 0.0
gi|118847|sp|P22374|DPOM_ASCIM PROBABLE DNA POLYMERASE 71 7e-12
gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE 65 3e-10
gi|461963|sp|P33538|DPOM_NEUIN PROBABLE DNA POLYMERASE 62 3e-09
gi|118850|sp|P10479|DPOL_BPPRD DNA POLYMERASE (PROTEIN P1) 56 2e-07
gi|232013|sp|P30322|DPOM_AGABT PROBABLE DNA POLYMERASE 46 2e-04
gi|118887|sp|P10582|DPOM_MAIZE DNA POLYMERASE (S-1 DNA ORF 3) 46 2e-04

Query= sid|110159|lan|182ORF004 Phage 182 ORF|4626-5954|3
(442 letters)

gi|138117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN ... 309 6e-84
gi|138118|sp|P07531|VG8_BPPZA MAJOR HEAD PROTEIN (LATE PROTEIN ... 305 7e-83

Query= sid|110160|lan|182ORF005 Phage 182 ORF|12651-13700|3
(349 letters)

gi|137932|sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE PR... 52 2e-06
gi|137933|sp|P07538|VG13_BPPZA MORPHOGENESIS PROTEIN 1 (LATE PR... 47 6e-05

Query= sid|110161|lan|182ORF006 Phage 182 ORF|14995-16026|1
(343 letters)

gi|137945|sp|P07541|VG16_BPPZA ENCAPSIDATION PROTEIN (LATE PROT... 402 e-112
gi|137944|sp|P11014|VG16_BPPH2 ENCAPSIDATION PROTEIN (LATE PROT... 402 e-112

Query= sid|110162|lan|182ORF007 Phage 182 ORF|7795-8775|1
(326 letters)

gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ... 256 3e-68
gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ... 256 5e-68

Query= sid|110163|lan|182ORF008 Phage 182 ORF|14105-14983|2
(292 letters)

gi|462559|sp|P34020|LYC_CLOAB AUTOLYTIC LYSOZYME (1,4-BETA-N-AC... 75 2e-13
gi|126652|sp|P25310|LYCM_STRGL LYSOZYME M1 PRECURSOR (1,4-BETA-... 60 5e-09
gi|127789|sp|P19386|LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE... 60 5e-09
gi|127787|sp|P15057|LYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE... 57 4e-08
gi|126597|sp|P00721|LYCH_CHASP N,O-DIACETYLMURAMIDASE (LYSOZYME... 57 4e-08
gi|127788|sp|P19385|LYCA_BPCP7 LYSOZYME (ENDOLYSIN) (MURAMIDASE... 57 5e-08
gi|3025168|sp|P76421|YEGX_ECOLI HYPOTHETICAL 32.0 KD PROTEIN IN... 53 5e-07

Query= sid|110164|lan|182ORF009 Phage 182 ORF|8765-9601|2
(278 letters)

gi|137921|sp|P04333|VG11_BPPH2 LOWER COLLAR PROTEIN (LATE PROTE... 171 1e-42
gi|131782|sp|P12753|RA50_YEAST DNA REPAIR PROTEIN RAD50 (153 KD... 48 2e-05
gi|1176490|sp|P40889|YJW5_YEAST HYPOTHETICAL 197.6 KD PROTEIN I... 45 1e-04
gi|731903|sp|P40434|YIR7_YEAST HYPOTHETICAL 197.5 KD PROTEIN IN... 45 1e-04
gi|730275|sp|P39793|PBPA_BACSU PENICILLIN-BINDING PROTEINS 1A/1... 45 2e-04
gi|1168610|sp|P41696|AZF1_YEAST ASPARAGINE-RICH ZINC FINGER PRO... 44 3e-04
gi|731587|sp|P38900|YH19_YEAST HYPOTHETICAL 70.1 KD PROTEIN IN ... 44 3e-04

Query= sid|110165|lan|182ORF010 Phage 182 ORF|1310-2155|2
(281 letters)

gi|135604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN 69 8e-12

Query= sid|110166|lan|182ORF011 Phage 182 ORF|9607-10158|1
(183 letters)

gi|137928|sp|P07537|VG12_BPPZA PRE-NECK APPENDAGE PROTEIN (LATE... 51 2e-06
gi|137927|sp|P20345|VG12_BPPH2 PRE-NECK APPENDAGE PROTEIN (LATE... 50 3e-06

Query= sid|110169|lan|182ORF014 Phage 182 ORF|13716-14108|3
(130 letters)

gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14) 97 2e-20
gi|137938|sp|P07539|VG14_BPPZA LYSIS PROTEIN (LATE PROTEIN GP14) 96 2e-20

Query= sid|110170|lan|182ORF015 Phage 182 ORF|854-1225|2
(123 letters)

gi|138072|sp|P06953|VG5A_BPPZA EARLY PROTEIN GP5A 69 2e-12

Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3
(96 letters)

gi|138111|sp|P13848|VG7_BPPH2 HEAD MORPHOGENESIS PROTEIN (LATE ... 57 9e-09
gi|138112|sp|P07533|VG7_BPPZA HEAD MORPHOGENESIS PROTEIN (LATE ... 54 4e-08

Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters)

gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6 55 2e-08
gi|138098|sp|P03685|VG6_BPPH2 EARLY PROTEIN GP6 54 5e-08
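
The E-value column in these tables mixes three notations: ordinary scientific notation (`3e-94`), a bare exponent (`e-106`, which BLAST of this vintage prints in place of `1e-106`), and `0.0` for values too small to print. The sketch below converts all three to floats so that hits can be compared numerically. The function name `parse_evalue` and the small hit list are illustrative, with values taken from the 182ORF001 rows above.

```python
def parse_evalue(token: str) -> float:
    """Convert a BLAST E-value token to a float.

    A bare exponent such as 'e-106' is read as 1e-106; other tokens are
    passed to float() unchanged.
    """
    token = token.strip()
    if token.startswith("e"):
        token = "1" + token
    return float(token)


# (accession, bit score, E-value token) for the 182ORF001 Swissprot hits
hits = [
    ("Q59968", 49, "2e-05"),
    ("P04331", 374, "e-103"),
    ("P07534", 384, "e-106"),
]

# Most significant hit first: smallest E-value, ties broken by higher score.
ranked = sorted(hits, key=lambda h: (parse_evalue(h[2]), -h[1]))
print([accession for accession, _, _ in ranked])
# ['P07534', 'P04331', 'Q59968']
```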
BLASTP 2.0.8 [Jan-05-1999]



Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters) 

>gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9)
>gi|75849|pir||WMBP9Z gene 9 protein - phage PZA
>gi|216058 (M11813) tail protein [Bacteriophage PZA]
Length = 599

Score = 384 bits (975), Expect = e-105
Identities = 231/610 (37%), Positives = 344/610 (55%), Gaps = 36/610 (5%)

Query 6 TNVKLIJUWPFDNTYTHTRWFKTQQEQESYFNSFPVLNENRDCSYQR^ 65 

TNV++LA+VPF N Y +TRWF + Q ++FNS + E ++Q ♦ V 
Sbjct: 9 TNV^ILADVPFSNDYKNTRWFTSSSNQYNWFNSKTRVYEMSKVTFQGFRENKSYISVS^ 68 

Ouerv 66 KDALYACNYLIFKNEETYPSKWQYAFVTDIEYKNDNTSFVTFEIDVLQTYRFDIGIRESF 125 

D LY +Y++F+N + Y +KW YAFVT +- + EYKN T ++ V FEIDVLQT+ F+I *ESF 
Sbjct: 69 LDLLYNAS Y IMFQNAD - YGNKWFYAFVTE LEYKNVGTTYVHF E I DVLQTWMFN I KFQES F 127 

Ouerv- 126 IAKEHPQLYYSNGIPFINTIEESI^YGREYTTTNVTTFH^ 183 

I + EH +L+ +G P INTI+B L+YG EY W P D ♦ FLV+++ M G 
Sbjct: 128 I VREHVKLWNDDGT PT I NT I D EG LNYG S E YD I VS VENHR P YDDMMF L W I S KS I KHGT AG 187 

Ouerv- 184 DKEDKSG GSIVGGPSPFSYYLLPINSSGEVYKPN-GAGNANFGEYMAFLT TKEP 236 

^ Y \ + E + S+ G P P YY+ P G+V K G NAN * LT ++> 

Sbjct: 188 EAESRLNDINASLNGMPQPLCYYIHPFYKIX^ 247 

Ouerv 237 FLNKIVGMYVTSYTGIPFIVDHANKTVTIYNAGGSYKIMLPTYASDPTGTMKTFAF 296 

+ N IV MYVT Y G+ + +K ++ + + + A D G + T VK* 

Sbjct: 248 AVNNIWMYVTDYIGLKLDYKNGDKELKLDKDMFEQAGI - - -ADDKHGNVDTIF- - -VKK 301 

Ouerv 297 ARTFVPKRIDLVGNVWFREAFPFNVIOSSKLFMYPYCLI^^ 356 
^ Y * + ID G+ + F + +ESKL MYPYC+ E+TD KG+ M L> EY+ 

Sbjct: 302 I PDYETLEID -TGDKWGGFTKD QESKLMMYPYCVTEVTDFKGNHMNLKTEYIDNN 355 

Ouerv 357 KLSVYVKGSLGISNKVMIEPIDYDVSNSTI ITNLSDKMLIDNDPNDVGVKSDYASA 412 

^ KL + V+GSLG+SNKV DY+ S +T D LI+N+PND* ♦ +DY S A 

Sbjct: 356 KLKIQV11GSLGVSNKVAYSIQDYNAGGSLSGGDRLTASLDTSLINNNPNDIAIINDYLSA 415 

Ouerv 413 FMQGNKNSLIAQEQNIRNTFRHGMGNSAMSTGGAIFSALASNNPFVGLTNIMGAGQQVNN 472 

++QGNKNSL Q+ +1 GM +S G ++ +PF " +++ G N 

Sbjct: 416 YLQGNKNSLENQKS S ILFNGI VGMLGGGVSAG AS AVGRS PFGLAS S VTGMTSTAGN 471 

Ouerv 473 YVSEKENGLNLLAGKVADIENIPDNVTQLGSNLS FTTGN-FQNYYQLRFKQIKYEYATRL 531 
W Y ' v + * L KADI NIP +T++G N +F GN ♦ ♦ Y KQ+K EY L 
Sbjct: 472 AVLD MQALQAKQAD I ANI P PQLTKMGGNTAFD YGNGYRGVYVI K - KQLKAE YRRSL 526 

Ouerv 532 DRYFSMYGTKSNRVATPNLQTRKAWNFIKLKEPNIVGTMSNDVLTRVTCQIFSAGVTLWHT 591 

+F YG K NRV PNL+TRKA+N+I+ K+ I G ++N + L IF GVTLWHT 

Sbjct: 527 SSFFHKYGYKINRVXKPNIJITRKAYNYIQTKDCFISGDINNNDLQEIRTIFDNGITLWHT 586 

Query: 592 NDVLNYNQDN 601 

+D+ NY+ +N 
Sbjct: 587 DDIGNYSVEN 596 



Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1
(573 letters)

>gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0161
DNA-directed DNA polymerase (EC 2.7.7.7) - phage M2
>gi|215509 (M33144) DNA polymerase [Bacteriophage M2]
Length = 572

Score = 665 bits (1697), Expect = 0.0
Identities = 327/589 (55%), Positives = 420/589 (70%), Gaps = 38/589 (6%)

Ouerv 3 KKYTGD F ETTTD LNDCRVWS WGVCD I DNVDNMT FGLE I D S F FEWCKMQGSTD I YFHN EKF 62 
K ++ DFETTT L+DCRVW++G +1 N+DN G +D F +W M+ D+YFHN KF 
Sbjct: 4 KMFSCDFETTTKLDDCRVWAYGYMEIGNUJNYKIGNSLDEFMQWV-MEIQADLYFHNLKF 62

Query 6 3 DGEFMLSWLFKNGFKWCKEAKEDRTFSTLISNMGQVTYALEICWEV>TYXXXXXXXXXXXXX 122 

DG F+++WL ++GFKW E + T++T*IS MGQWY ++IC* 
Sbjct: 63 DGAFIVOTLEQHGFKWSNEGLPN-TYNTIISKMGQWYMIDICFGYK GKRKL 112 

Ouerv 123 XXIIYDSLKKYPFPVKQIAEAFNFPIKKGEIDYTKERPIGYKPTKDEWEYLKNDIQIMAM 182 

+IYDSLKK PFPVK+IAt F P + KG-t-IDY ERP+G++ T +B*EY*IMDI*I+A 
Sbjct: 113 HTVIYDSLKKLPFPVKKIAKDFQ.LPLLKGDIDYHTERPVGHEITPEEYEYIKNDIEIIAR 172 

Query: 183 ALK IQPD<*3LTWirWWD^^ 242 

AL IQF QGL RMT GSD+L +KD L F + FP XSL DK++RKAY+GGFT 

Sbjct: 173 ALDIQFKQGLDRMTAGSDSLKGFKDILST KKFNKVFPKLSLPMDKEIRKAYRGGFT 228 

Query: 243 WVNKVFQGKEIGDGIVFDVNSLYPSQ^PLPYGTPLFYEGEYKP^YPLYIQNIl^ 302 

W+N ++ KEIG+G+VFDVNSLYPSQMY RPLPYG P* ++G+Y* + YPLYIQ 1+ 
Sbjct: 229 WLNDKYKEKEIGEGMVFDVNSLYPSQMYSRPLPYGAPIVFQGKYEKDEQYPLYIQRIRFE 288 

Ouerv 303 FRLKEGYIPTIQVKQSSLFIQNEYLESSVNKLGVDELIDLTLTNVDLELFFEHYDILEIH 362 
Query. 303 LKEGYIPTIQ+K+-+ F NEYL++S GV E ++L LTNVDLEL EHY++^+- 
Sbjct: 289 FELKEGYIPTIQIKKNPFFKGNEYLKNS- GV-EPVELYLTNVDLELIQEHYELYNVE 343 

On*rv 363 YTYGYMFKASCDMFKGWIDKWIEVKNTTEGARKANAKGMLNSLYGKFGTNPDITGKVPYM 422 

Query. 363 ™°™^ s ^ ^ ^ MUJSLyGKF +NPD+TGKVP y 

Sbjct: 344 YIDGFKFREKTGLFKDFIDKWTYVKTHEEGAKKQLAKLMLNSLYGKFASNPDVTGlCVPYIi 403 

Query: 423 GEDGIVW/TU5EEELRDPVYVPLASFVTAWGR™ 482 

+DG + +G+EE +DPVY P+ F+TAW R+TTIT AQ C+DRIIYCDTDSIHL GTE 
Sbjct: 404 KDDGSLGFRVGDEEYKD PVYTPMGVF ITAWARFTT ITAAQACYDRI I YCDTDS IHLTGTE 463 

Ouerv 483 VPEAI DH LVD P KKLGYWGHE ST FQRAKF I RQKT YVEEIDGEL 524 

VPE I +VDPKKLGYW HESTF+RAK++RQKT YV+E+DG+L 

sbjcC; 4g4 VPEIIKDIVDPKKLGYWAHESTFKRAKYLRQKTYIQDIYVKEVDGKLKECSPDEATTTKF S23 

Ouerv 525 NVKCAGMPDRIKEIVTFDNFEVGFSSYGKLLPKRTQGGWLVDTMFTIK 573 

. +VKCAGM D IK* VTFDNF VGFSS GK P + GGWLVD++FTIK 

Sbjct: 524 SVKCAGMTDTIKKKVTFDNFAVGFSSMGKPKPVQVNGGWLVDSVFTIK 572 

Query= sid|110159|lan|182ORF004 Phage 182 ORF|4626-5954|3
(442 letters)

>gi|138117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN GP8)
>gi|75845|pir||WMBP89 gene 8 protein - phage phi-29
>gi|215325 (M14782) major head protein [Bacteriophage
phi-29] >gi|225362|prf||1301270B gene 8 [Bacillus sp.]
Length = 448

Nicies 0 ! ^^O^^osi^es" 250/440 (56*,. Gaps - 27/440 (.%, 

Query: 4 KITSQOVXRATN^ " 
Sbjct: 2 RITFNDVKTSLGITESYDIVNAIRNSQGDNFKSYVPLATANNVAEVGAGILINQTVQNDF 61 

Query 64 IS TLVDRIGKWIRYKSWRNPLKMFKKGNMPLGRTIEEIFVDIAQEHKFNPDESVTGVFK 123 
Query. I++LVDRlG WIR S NPLK FKKG +PLGRTIEEI+ DI *B *+* *E* VF* 

Sbjct: 62 ITSLVDRIGLWIRQVSLNNPLKKFKKGQIPLGRTIEEIYTDITKEKQYDAEEAEQKVFE 121 

Query: 124 QEVPDVKTLFHEINREGYYXQTIQEAWLEKAFTSVroNFNSFVAGVMNALYTGDEVSEF^ 183 

*E*P*VKTLFHE NR+G+Y QTIQ+ L+ AF SW NF SFV+ +*NA+Y EV E*EY 
Sbjct: 122 REMPNVKTLFHERNRQGFYHQTIQDDSLKTAFVSWGNFESFVSSIINAIYNSAEVDEYEY 181 

Query: 184 T KLLIANYQEKELFKEIEIGEITESNA--KEFIRKIKSTSNiaEF M --SSAYNAQ 3 VKTS 239 

KtU MY K LF ++I E T S EF ++K+++T+ KL S +N+ v+i- 

Sbjct: 182 Mi^LVDbTTfSKGLFTTVKIDEPTSSTGALTEFVKKMRATARKLTLPQGSRDWNSMAVRTR 241 

Query 240 TSKS DQYXXXXXXXXXXXXXXXXXXXFNMSKTDFVGHKIVIDEFPKKEGEESSNIVAVIV 299 
Query. «u ia fNM**TDF+G+ VID F S+ + AV*V 

Sbjct: 242 S YMEDLHLI IDADLEAELDVDVLAKAFNMNRTDFLGNVTVIDGF ASTGLEAVLV 295 

Query 300 DSEWFMIYDKLYKTTSLYNPEGLYWNYWLHHHQLYSTSQFGNAVAFVKSATKPVTKVAFA 359 

D +WFM+YD L+K ** NP GLYWNY* HQS S*F NAVAFV VT+V * 

Sbjct: 296 DKDWFMVYDNLHKMETVRNPRGLYWNYYYHVWQTLSVSRFANAVAFVSGDVPAVTQVIVS 3S5 

Query: 360 SATTSVVKGSSKDIALTFTPVEATNQQGEWSSAPALVKATVKQTAGKATAVTVEGLEVG 419 
+V +G + V ATN + V V G +T + G

Sbjct: 356 PN I AAVKQGGQQQ FT AYVRATNAKDHKV - - VWSVEGGSTGTAI TG 398 

Query: 420 QSLVTFTAIGGQQATVLVTV 439 

L++ + Q TV TV 

Sbjct: 399 DGLLSVSGNEDNQLTVKATV 418 

Query= sid|110160|lan|182ORF005 Phage 182 ORF|12651-13700|3
(349 letters)

>gi|137932|sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE
PROTEIN GP13) >gi|75858|pir||WMBP23 gene 13 protein -
phage phi-29 >gi|215331 (M14782) morphogenesis protein
[Bacteriophage phi-29] >gi|225368|prf||1301270H gene 13
[Bacteriophage phi-29]
Length = 365

Score = 51.5 bits (121), Expect = 8e-06
Identities = 44/166 (26%), Positives = 70/166 (41%), Gaps = 14/166 (8%)

Ouerv 6 NEQ I ARGQT I AKI LS KYG YNKNSQ VG WANLHWE S A GLNPNSNEXXXXXXXXX - QWT 61 

Y * >E q i LS G+ K ♦ G++ N+ ES GL N +E QWT 

Sbjct: 12 SEMKVNAQYIL^LSSNGWTKQAICGMUJNMQSESTINPGLWQNUDEGNTSLGFGLVQWT 71 

Query • 62 PKSNLYRQAQICGLSNAKAETLEGQAEIIAQGDKTGQWMDNTPVSSAGYTNPQTLSAFKQ 121 

P SN A GL II ♦ + QW++ V K 
Sbjct: 72 PASNYINWANSQGLPYKDMDS - - ELKRI I WEVNNNAQWINLRDMTFKEY IKS 121 

Query* 122 SANIDVATINFMCHWERPGKLHIEERLDLAQAYSKHIDGSGGGGVK 167 

+ + p + +ERP + ER D Af + K++ G GGGG++ 

Sbjct: 122 TKTPRELAMIFLASYERPANPNQPERGDQAEYWYKNLSGGGGGGLQ 167 

Query= sid|110161|lan|182ORF006 Phage 182 ORF|14995-16026|1
(343 letters)

>gi|137945|sp|P07541|VG16_BPPZA ENCAPSIDATION PROTEIN (LATE PROTEIN
GP16) >gi|75861|pir||WMBP16 gene 16 protein - phage PZA
>gi|216065 (M11813) morphogenesis protein C
[Bacteriophage PZA]
Length = 332

Score = 402 bits (1023), Expect = e-111
Identities = 186/332 (56%), Positives = 244/332 (73%), Gaps = 2/332 (0%)
Query 11 EKNLYYNP^AI^FNCLMLFVIGARGIGKTYGYKKFV^FIKHGEQFIYLRRFKTEL^ 70 

+K+L+YNP L ++ ++ FVIGARGIGK+Y K + +NRFIK+GEQFIY+RR+K EL K 
Sbjct: 2 DKSLFYNPQK>ILSYDRILNFVIGARGIGK^YAMKV^PINRFIK^GEQFIYVRRYKPE^ 61 

Ouerv 71 I PQ F F KTMAKE F P D HKLEVKG KE FYCDD KLMGWAV P LS TWG I E KSNE Y P EVRT I LFD E FL 
W ^* + +F +A+EFPDH+L VKG+ FY D KL GWA+PLS W EKSN YP V TI+FDEF+ 

Sbjct: 62 VSNtfFNDVAQEFPDHELVVKGRRreiDGKIAGWAIP * 21 

Ouerv 131 IEKSKITYLPNEAEALLNMMETVFRRRTNTRCVMLSNATSVVNPYF 190 
U Y ' EK Y+PNE AIiLN+M+TVFR R RC+ LSNA SWKPYFL+FNL PD+NKRFN 

Sbjct: 122 REKDNSNYIPNEVSALLNL^TVTRNRERWCICLSNA "1- 

Ouerv 191 LYQDRGILIELCDSKDFAETORETPFGRLIRGTEYEDFSINNEFVNDSDTFIEKRSKNSS 250 

+ y D LIE+ DS DF+ +RVT FGRLI GTEY + S++N+F+ DS FIEKRSK+-S 
Sbjct: 182 VYDD- -ALIEIPDSLDFSSERRKTRFGRLIDGTEYGEMSLDNQFIGDSHVFIEKRSKDSK 239 

Ouerv 251 FLCAIAFEGKIFGYWIDAETGCVTVSYI)YQPNTNHFYA>riTKDHEENRLL^^ 310 
^ ^' F+ +I + G G W+ D G +YV + P+T + Y +TT D EN +L+ N++NNY+L 

Sbjct: 240 FVFSIVYNGFTLGWVT)WQGLMYVI)TAHDPSTKNW 299 

Query: 311 STVAKAFKNSYLRFDNIVIKNLHYDLFNKMKI 342 

+A AF N YLRFDN VI+N+ Y+LF KM+I 
Sbjct: 300 RKLASAFMNGYLRFDNQVIRNIAYELFRKMRI 331 

Query= sid|110162|lan|182ORF007 Phage 182 ORF|7795-8775|1
(326 letters)

>gi|1429239|emb|CAA67658| (X99260) upper collar protein
[Bacteriophage B103]
Length = 308

Score = 271 bits (685), Expect = 6e-72
Identities = 131/275 (47%), Positives = 187/275 (67%), Gaps = 5/275 (1%)

Query 36 YYEHYRRQLTLLTFQLFEWENLPKSIDPRYLEIALHTNGYLGFFKDPTLGFMVCAGAEDG 95 

+Y HY + L L +QLFEWE LP S+DP YLE ++H GY+GF+KDP +G++ C GA G 
Sbjct: 22 WYYHYYQYLCSLAYQLFEWERLPPSVDPSYLEKSIHQFGYVGFYKDPRIGYIACQGALSG 81 

Query 96 QlDHYHNPIFFTANEAhfYHKRYPVLRYDDDDDKSKCIMLYNNDLKVPTLPSliiRFAl^^ 155 

>DHY+ P F A+ Y + + Y D +K+ + +YNNDLK TLP+L FA D+A 
Sbjct: 82 TVDHYNLPDRFHASSVGYQbTTFKLYNYSDMKEKNMGVAIYNNDLKCSTLPALEMFAQDLA 141 

Query 156 DINQISRVNRRAQKTPVIIQTDEKKYFSLLQAYNQIDENNQAVFVDKDMEFDESFNVWQT 215 

+ + +1 VN+ AQKTPV+I + + SL YNQ + N + FV + ++ D + V++T 
Sbjct: 142 ELKEIIAVNQNAQKTPVLIAANDNNQLSLKNIYNQYEGNAPVIFVHESLDLD-NLKVFKT 200 

Query 216 NAPYVVDKIiRSELNEVVWEVLTFIX3I^A^KTARVQTSEvLSNNEQIESSGNILLKSR 275 

+APYWDKL ++ N VWNEV+T+LGI NAN++K R+ TSEV SN+EQIESSGNI LK+R 
Sbjct: 201 DAPYVVDKXNAQKNAVWNEVMTYLGIKN^^ 2S0 

Query: 276 KEFCDRVNRVFGDELDGKIDVKFRTDAVRQLQLAA 310 

+E C++++ ++G L VXFR D V Q++L A 

Sbjct: 261 QEACNKISELYGLNL KVKF R YD I VEQMRLN A 291 

Query= sid|110163|lan|182ORF008 Phage 182 ORF|14105-14983|2
(292 letters)

>gi|4210750|emb|CAA10710| (AJ132604) LysL protein [Lactococcus
lactis]
Length = 235

Score = 139 bits (347), Expect = 2e-32
Identities = 85/210 (40%), Positives = 114/210 (53%), Gaps = 14/210 (6%)

Query 2 MNGIDISSYQTGIDLSKVPCDFVNIKATGGTGYVNPDCDRAFQQALSLGKKIGVYHFAHE 61 

MNGIDISSYQ VP DFV IKAT GT Y+NP + Q + K +G YHFA 

Sbjct: 1 MNG I D I S S YQ AE LNAG IVPSDFVII KAT EGTNY I N PTWEEQAGQ V I QTUKLIC F YHF AS - 59 

Query: 62 RGLEGTPQQEAQFFLDNIKGYIGKAVLILDFEGS--NQKDVNWAKAFLDYVYNKTGVKAW 119 

G P EA FF+ +K YIGKAVL+LDFE - N A+ FL+ V KTG+ 

Sbjct: 60 - --VGNPIAEADFFISVVKNYIGKAV^VIiDFEAGAINAWGNVGARQFLNRvKEKTGINPM 116 

Query: 120 FYTYTANLNTTDFSSIAKGDYGLWVAEYGSNQPQGYSQPAPPKTNN-----FPIVACFQF 174 

Y ♦ ++S+I+ + LWVA+Y S P GY + P T+ + A Q+ 

Sbjct: 117 IYMSSDVTRQFNWSTISSTN-PLWVAQYASMNPTGYQ--SEPWTDGKGYGAWSSAAIHQY 173 

Query: 175 TSKGRLPGYNGNLDLNVFYGDGNTWDLYVG 204 

+S G L ++GNLD+N+ ' Y + N W G 
Sbjct: 174 SSAGSLSNWSGNLDINLAYINANQWKSLAG 203 



Query= sid| 110164 | lan| 182ORF009 Phage 182 ORF| 8765-9601 | 2 
(278 letters) 

>gi|1429240|emb|CAA67659| (X99260) lower collar protein 
[Bacteriophage B103] 
Length = 293 

Score = 180 bits (451), Expect = 1e-44 
Identities = 115/296 (38%), Positives = 161/296 (53%), Gaps = 33/296 (11%) 

Query: 3 LKRYIESFTYYQPELSRKERIEVGRKQLFDFDYPFYDETKRAEFETKFINHFYLREIGSE 62 

L YIE ++ Y+ LS E+IE GR +LFDF YP >DE+ R FET FI +FY+REIG E 
Sbjct: 8 LSTYIEMWSQYETGLSMAEKIEKGRPKLFDFQYPIFDESYRKVFETHFIRNFYMREIGFE 67 

Query: 63 TMGSFKFNLDEYLNLNMPYWNKMFLSNLEEF-PIFDDMDYTIDEKQKLLNEIDTNIKANR 121 

T G FKFNL+ +L +NMPY+NK+F S L ++ P+ ♦ T K+ DT NR 

Sbjct: 68 TEGLFKFNLETWLIINMPYFKKLFESELIKYDPLENTRLNTTGNKKN-----DTERNDNR 122 

Query: 122 D - -ESKNQTKQVDQTDNRNKOTRDTGTT DSFSRNTYTDTPQKDLRIASNG 169 

 D + K+ TK D+T+ ■ + D TT D+F + R *D P L + +N 
Sbjct: 123 DTTGSMKAIXSKSNTKTSDKTNATGSSKEDGKTTGSVTDDNFNRKIDSDQPDSRL^TTN^ 181 

Query 170 DGTGVINYATNITEDLSKETTSSTGVETNNDKTNQNTRSNAS EKETKNTD 219 

DG G f YA+ 1 + ++TG TNN +♦ ♦ S S T N 

Sbjct: 182 DGQGTLEYASAIEENNTNNKRNTTG- -TNNVTSSAESESTGSGTSDTVTTDNANTTTNDK 239 

Query • 220 INKDQNQTKDTITRYKGKKGNTDYADLLEKYRRSVLRIEKMIFREMNKEGLFLLVY 275 

+N N +D I GK G YA L+ + YR ++LRIEK IF EM + LF+LVY 
Sbjct: 240 LNSQINNVEDYIESKIGKSGTQSYASLVQDYRAALLRIEKRIFDEMQE-LFMLVY 293 

Query= sid| 110165 | lan| 182ORF010 Phage 182 ORF| 1310-2155 | 2 
(281 letters) 

>gi|135604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN 
>gi|75815|pir|7ERBPNP terminal protein - phage NF 
>gi|579177|emb|CAA68440| (Y00363) gene E product (AA 
1-267) [Bacteriophage NF] 
Length = 266 

Score = 74.9 bits (181), Expect = 6e-13 

Identities = 73/275 (26%), Positives = 129/275 (46%), Gaps = 37/275 (13%) 

Query: 3 VT^ISKNDRAKLEKIYGKSNKARKKT!fNRLRQK-GVE ERQLPTVPTSKKRLIDYVKSTN 58 

+RH-ND+A K+ K+ KA K +R ++K G++ E +LP ♦ + + 
Sbjct: 7 IRITNNDKALYAKLV-KNTKA--KISRTKKKYGIDLSNEIELPPLESFQ 52 

Query 59 MSRSDFNKMU>ELvT)FAQPYNENYIFEINKRNVAISRAQ^ 

+R +FNK + F N+NY F NK + S+A+I E . T++AQ+ + E +E 
Sbjct: 53 -TREEFNKWKQKQESFTNRANQ^QFVra^ HI 

Query: 119 L - -NKV^VKKPTENTIVTPTILTELGADLPFQAIPDFNIDAFTSPEGVQSYLEN 170 

^ y " + K ♦ I++P+ +T G P DFN D S +++ E 

Sbjct: 112 IEDKPFISGGKQQGTVGQRMQILSPSQVT - - GISRP - - --SDFNFDDVRSYARLRTLEEG 165 

Query- 171 IG-KQDEQYFDERDQLYYDNFRQAMFTIFNSD- -ADDIVRLLDSMGLDLFMKTYVSNFliD 227 

+ K Y+D R + NF + + FNSD +D++V L + DF + Y+ F + 
Sbjct: 166 MAEKAS PDYYDRRMTQMHQNFI E I VE KS FNSDWLSDELVERLKKI P PDDFFELYLM - FDE 224 

Query: 228 MNLDYIYDEAEVQQKKEQVYSKIAKVIESETGGEV 262 

+ + +Y EE + E + +KI ++ G+V 
Sbjct: 225 ISFEYFDSEGEDVEASEAMLNKIHSYLDRYERGDV 259 

Query= sid| 110166 | lan| 182ORF011 Phage 182 ORF| 9607-10158 | 1 
(183 letters) 

>gi|1429241|emb|CAA67660| (X99260) pre-neck appendage protein 
[Bacteriophage B103] 
Length = 860 

Score = 50.8 bits (119), Expect = 6e-06 
Identities = 29/105 (27%), Positives = 56/105 (52%), Gaps = 6/105 (5%) 

Query: 8 KRFDGLPAVFKERFSKYPHTEYRYELLIiDEEVSALIAYIJfEVGA 

' +RF+ L + + + +YT+ +L E+++ +1 YLN++G L ND+ N *E 

Sbjct: 7 RRFEIUXSEMMVQVYERYLPTAFDES^^ 66 

Query: 68 V-EKLEEITNDTLKKWLSDGTLENLINDTVFANYIKEIKRLQILV 111 

+ + LE+ +TL+KW +G +L+ I + V 

Sbjct: 67 LNDGLEDYVKETLEKWYEEGKFADLV IQVIDELKQFGVSV 106 

Query= sid| 110169 | lan| 182ORF014 Phage 182 ORF| 13716-14108 | 3 
(130 letters) 

>gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14) 
>gi|75860|pir||WMBP29 gene 14 protein - phage phi-29 
>gi|15678|emb|CAA28631| (X04962) gene 14 product (AA 
1-393) [Bacteriophage phi-29] >gi|225369|prf||1301270J 
gene 14 [Bacteriophage phi-29] 
Length = 131 

Score = 96.7 bits (237), Expect = 6e-20 
Identities = 53/131 (40%), Positives = 81/131 (61%), Gaps = 3/131 (2%) 

0 ,, prv . , MiEVlTQWL-AXJDtmLVYGLIIWLWAMIIDFVWrriAKFNKEIDFSSFKAl^GIIvOT 59 
Query. 1 MIEYITQ^ ^ ^ ^ m++q ^ AK N I FSSFK K G ++ +KV 

Sbjct: 3 MIAWMQHFLETDETKLIYWLT-FLMVCMVVDTVIX3VLFAKLNPNIKFSSFKIKTGVLIKV 61 

Ouerv 60 AEMVL WYF I P VAVKFGAVG I TMYITMLVGLI LS EIYSILGHISDI DDONNWTD YVKKFL 119 
Query, bo L +SEIYSI GH+ +DD ++♦ + ♦+ F 

Sbjct = 62 SE^LALLAlpFAVPFPA-GLPLLVTVrrALCVSEtYSIFGHLRLVDDKSDFLEILENFF 120 

Query: 120 DGTLNRKDDIK 130 

T + + K 
Sbjct: 121 KRTSGKNKEEK 131 

Query= sid| 110170 | lan| 182ORF015 Phage 182 ORF| 854-1225 | 2 
(123 letters) 

>gi|15670|emb|CAA24483| (V01155) reading frame 10 (may be gene 4) 
[Bacteriophage phi-29] 
Length = 124 

S5££-l ^i 1 i 1 02;, E To C s t itI V :r^64/119 (53%). Gaps = 3/119 (2.) 
Query ! 3 ivKSTFDTQTPEGMLQVFNATNGASIPLRNAI -GEVLELKDILVYSDEVSGFGGAEPSQA 61 

sbjC t= 6 %£££££^^ 

Query: 62 j^J^^^^^^^ *^K*^ ^V*G^S*"o^^y^^^2* 

Sbjct: 64 TVTTIFAAMSLYSAISKTVAEAASDLIDLVTRHKLETFKVKVVQGTSSKGNVFFSIjQIj 



Query= sid| 110174 | lan| 182ORF019 Phage 182 ORF| 4323-4613 | 3 
(96 letters) 

>gi|1429235|emb|CAA67654| (X99260) head morphogenesis protein 
[Bacteriophage B103] 
Length = 101 

SSi^! ;i t /96 ( r3 5 5l.*. E Si^: 0 l3/96 (54*,. Gaps . 5/96 (5*, 

Query: 1 -KE^LE™ " 
Sbjct: 3 ^W)SHEElLMKIilDPELEHSERTEL---I^LRADYGSVLSEFSELTSATEKLRAElISD 59 

Query: 61 LVISNSKLFRERAIVEPAEN--NEPETDQNITLDDL 94 

L++SNSKLFR+ I + E + E * IT++DL 
Sbjct: 60 LIVSNSKLFRQVGITKEKEEEIKQEELSETITIEDL 95 

Query= sid| 110180 | lan| 182ORF025 Phage 182 ORF| 548-814 | 2 
(88 letters) 

>gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6 
>gi|75841|pir||ERBP6Z gene 6 protein - phage PZA 
>gi|216047| (M11813) gene 6 product [Bacteriophage PZA] 
>gi|224746|prf||Hi2171K ORF 6 [Bacteriophage PZA] 
Length = 96 

Score = 55.0 bits (130), Expect = 8e-08 
Identities = 28/79 (35%), Positives = 45/79 (56%) 

Query: 4 KLMQRNVTSTKVEFSEVIVQIXjAPTIVPCEPVVLTGK^ 

K+MQR + T T V + DG + G LS E+A +KRK + V V + 

Sbjct ■ 3 KMMQREITKTTVNVAKWVMVDGEVQVEQLPS 






Query: 64 VSHETALYTMPVDKFIELA 82 

V T +Y +PV+KF+E+A 
Sbjct: 63 VEPNTEVYELPVEKFLEVA 81 

Table 26 

Secondary structure prediction for ORF 182ORF008 

1 MMNGIDISSY QTGIDLSKVP CDFVNIKATG GTGYVNPDCD RAFQQALSLG KKIGVYHFAH 
CCCCCCCCCC CCCCCCCCCC CCEEEEEECC CCCCCCCCCC HHHHHHHHHC CCCCEEEEEE 

61 ERGLEGTPQQ EAQFFLDNIK GYIGKAVLIL DFEGSNQKDV NWAKAFLDYV YNKTGVKAWF 

CCCCCCCCHH HHHHHHHHHC CCCCEEEEEE CCCCCCCHHH HHHHHHHHHH HCCCCCEEEE 
121 YTYTANLNTT DFSSIAKGDY GLWVAEYGSN QPQGYSQPAP PKTNNFPIVA CFQFTSKGRL 

EEECCCCCCC CCCEECCCCC CEEEEECCCC CCCCCCCCCC CCCCCCCEEE EEEECCCCCC 
181 PGYNGNLDLN VFYGDGNTWD LYVGKKQDQI VPPENKIFDA TSDEFIFTLT TGSTSVFYFD 

CCCCCCCCEE EEECCCCCCB EEECCCCCCC CCCCCCCCCC CCCEEEEEEC CCCCEEEECC 
241 GETIFELSDP TQLDHIRGTY NHVHGKEIPS MVWTPEQFDI YLKMYEKKPV YK 

CCEEEECCCC CCHHHHCCEE CCCCCCEECC CCCCCCCHHH HHHHHCCCCE EC 
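The prediction rows above pair each residue with a one-letter state. As an illustration only, the following Python sketch summarizes such a row into state fractions and contiguous segments. It assumes the common three-state convention (H for helix, E for extended strand, C for coil), which the table itself does not spell out, and the helper names state_fractions and segments are hypothetical.

```python
from collections import Counter
from itertools import groupby


def state_fractions(prediction):
    """Fraction of residues in each of the assumed states H, E and C."""
    states = [c for c in prediction.upper() if c in "HEC"]
    if not states:
        raise ValueError("no H/E/C states found")
    counts = Counter(states)
    return {s: counts[s] / len(states) for s in "HEC"}


def segments(prediction):
    """Contiguous runs as (state, first residue, last residue), numbered from 1."""
    compact = "".join(prediction.split()).upper()  # drop the 10-residue block spacing
    runs = []
    start = 1
    for state, group in groupby(compact):
        length = len(list(group))
        runs.append((state, start, start + length - 1))
        start += length
    return runs


# First prediction row for ORF 182ORF008 (residues 1-60), copied from the table.
row = "CCCCCCCCCC CCCCCCCCCC CCEEEEEECC CCCCCCCCCC HHHHHHHHHC CCCCEEEEEE"
print(state_fractions(row))
print(segments(row))
```

Because segments numbers residues by position in the string, any unreadable character in a row must be kept as a placeholder rather than dropped, or the later coordinates will shift.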



Secondary structure prediction for ORF 182ORF014 

1 MIEYITQWLA DDNHLVYGLI IWLMVAMIID FVLGFTIAKF NKEIDFSSFK AKAGIIVKVA 
CCCCEECCCC CCCCHHHHHH HHHHHHHHHH HHHHHHHHHC CCCCCHHHHH HHHCEEEEEE 
61 S5S55SS SSSIl TMYITMLVGL ILSEIYSILG HISDIDDDNN WTD YVKKFLD 
eSeECC CEEECCCEEE EEEEEEEEEE EEEEEEEECC CCCCCCCCCC CEEEEEEECC 

121 GTLNRKDDIK 
CCCCCCCEEC 

Table 27 

Enterococcus accession numbers 242/242 



gi|2895751|gb|AF044978.1|AF044978 [2895751] 

gi|4803755|dbj|AB026843.1|AB026843 [4803755] 

gi|4769001|gb|AF140549.1|AF140549 [4769001] 

gi|4760901|gb|AF099088.1|AF099088 [4760901] 

gi|4704705|gb|AF121254.1|AF121254 [4704705] 

gi|3342117|gb|AF076604.1|AF076604 [3342117] 

gi|4688824|emb|AJ132470.1|ESP132470 [4688824] 

gi|4732085|gb|AF125553.1|AF125553 [4732085] 

gi|4732082|gb|AF125552.1|AF125552 [4732082] 

gi|4732079|gb|AF125551.1|AF125551 [4732079] 

gi|4732076|gb|AF125550.1|AF125550 [4732076] 

gi|4732073|gb|AF125548.1|AF125548 [4732073] 

gi|4732070|gb|AF125547.1|AF125547 [4732070] 
gi|4732067|gb|AF125546.1|AF125546 [4732067] 
gi|4732064|gb|AF125545.1|AF125545 [4732064] 
gi|4732061|gb|AF125544.1|AF125544 [4732061] 
gi|4704653|gb|AF114715.1|AF114715 [4704653] 
gi|4704564|gb|AF102550.1|AF102550 [4704564] 
gi|4688827|emb|AJ238249.1|EFA238249 [4688827] 
gi|4680606|gb|AF125198.1|AF125198 [4680606] 
gi|4633279|gb|AF117609.1|AF117609 [4633279] 
gi|4633124|gb|AF110130.1|AF110130 [4633124] 
gi|4590399|gb|AF124258.1|AF124258 [4590399] 
gi|4590336|gb|AF108380.1|AF108380 [4590336] 
gi|4590335|gb|AF108379.1|AF108379 [4590335] 
gi|4019167|gb|U21300.1|CXU21300 [4019167] 
gi|4545122|gb|AF077816.1|AF077816 [4545122] 
gi|4433610|gb|AF106614.1|AF106614 [4433610] 
gi|4468838|emb|AJ132039.1|EFA132039 [4468838] 
gi|4468121|emb|AJ132958.1|BPH132958 [4468121] 
gi|4456104|emb|Y17302.1|EHI17302 [4456104] 
gi|4433611|gb|AF106615.1|AF106615 [4433611] 
gi|4433607|gb|AF106611.1|AF106611 [4433607] 



gi|4098267|gb|U76614.1|BLU76614 [4098267] 
gi|47019|emb|Y00116.1|SFAMB1 [47019] 
gi|4158179|emb|AL035206.1|SC9B5 [4158179] 
gi|4165458|emb|X79343.1|EF16SSPA [4165458] 
gi|4165457|emb|X79342.1|EFTRNALA [4165457] 
gi|4165456|emb|X79341.1|EF23SRNA [4165456] 
gi|4150978|emb|Y14027.1|EFY14027 [4150978] 
gi|4127803|emb|AJ223161.1|EFAJ3161 [4127803] 
gi|2956685|emb|Y16413.1|EFENTIJO [2956685] 
gi|2665346|emb|Y13922.1|EHY13922 [2665346] 
gi|4324675|gb|AF109375.1|AF109375 [4324675] 
gi|4234627|gb|AF061013.1|AF061013 [4234627] 
gi|4234626|gb|AF061012.1|AF061012 [4234626] 
gi|4234625|gb|AF061011.1|AF061011 [4234625] 
gi|4234624|gb|AF061010.1|AF061010 [4234624] 
gi|4234623|gb|AF061009.1|AF061009 [4234623] 
gi|4234622|gb|AF061008.1|AF061008 [4234622] 
gi|4234621|gb|AF061007.1|AF061007 [4234621] 
gi|4234620|gb|AF061006.1|AF061006 [4234620] 
gi|4234619|gb|AF061005.1|AF061005 [4234619] 
gi|4234618|gb|AF061004.1|AF061004 [4234618] 
gi|4234617|gb|AF061003.1|AF061003 [4234617] 
gi|4234616|gb|AF061002.1|AF061002 [4234616] 
gi|4234615|gb|AF061001.1|AF061001 [4234615] 
gi|4234614|gb|AF061000.1|AF061000 [4234614] 
gi|3138990|gb|AF060241.1|AF060241 [3138990] 
gi|3138986|gb|AF060240.1|AF060240 [3138986] 
gi|4204535|gb|AF094803.1|AF094803 [4204535] 
gi|4204534|gb|AF094802.1|AF094802 [4204534] 

gi|4204533|gb|AF094801.1|AF094801 [4204533] 
gi|4204532|gb|AF094800.1|AF094800 [4204532] 
gi|4204531|gb|AF094799.1|AF094799 [4204531] 
gi|4204530|gb|AF094798.1|AF094798 [4204530] 
gi|4204529|gb|AF094797.1|AF094797 [4204529] 
gi|4204528|gb|AF094796.1|AF094796 [4204528] 
gi|4204527|gb|AF094795.1|AF094795 [4204527] 
gi|4204526|gb|AF094794.1|AF094794 [4204526] 
gi|4204525|gb|AF094793.1|AF094793 [4204525] 
gi|4204524|gb|AF094792.1|AF094792 [4204524] 
gi|4204523|gb|AF094791.1|AF094791 [4204523] 
gi|4204522|gb|AF094790.1|AF094790 [4204522] 
gi|4204521|gb|AF094789.1|AF094789 [4204521] 
gi|4204520|gb|AF094788.1|AF094788 [4204520] 
gi|4204519|gb|AF094787.1|AF094787 [4204519] 
gi|4204518|gb|AF094786.1|AF094786 [4204518] 
gi|4204517|gb|AF094785.1|AF094785 [4204517] 
gi|4204516|gb|AF094784.1|AF094784 [4204516] 
gi|4204515|gb|AF094783.1|AF094783 [4204515] 
gi|4204514|gb|AF094782.1|AF094782 [4204514] 
gi|4204513|gb|AF094781.1|AF094781 [4204513] 
gi|4204512|gb|AF094780.1|AF094780 [4204512] 
gi|3873186|gb|AF034779.1|AF034779 [3873186] 
gi|4151367|gb|AF093508.1|AF093508 [4151367] 
gi|2828136|gb|AF039903.1|AF039903 [2828136] 
gi|2828135|gb|AF039902.1|AF039902 [2828135] 
gi|2828134|gb|AF039901.1|AF039901 [2828134] 
gi|2828133|gb|AF039900.1|AF039900 [2828133] 
gi|2828132|gb|AF039899.1|AF039899 [2828132] 
gi|2828131|gb|AF039898.1|AF039898 [2828131] 
gi|4103866|gb|AF028812.1|AF028812 [4103866] 
gi|4103864|gb|AF028811.1|AF028811 [4103864] 
gi|2605925|gb|AF029727.1|AF029727 [2605925] 
gi|1402750|gb|U60038.1|EFU60038 [1402750] 
gi|1835780|gb|U86375.1|EFU86375 [1835780] 
gi|3831555|gb|AF047608.1|AF047608 [3831555] 
gi|3790617|gb|AF097414.1|AF097414 [3790617] 
gi|3767587|dbj|AB005036.1|AB005036 [3767587] 
gi|3757810|gb|AF042288.1|AF042288 [3757810] 
gi|3747039|gb|AF093509.1|AF093509 [3747039] 
gi|3660559|dbj|AB017811.1|AB017811 [3660559] 
gi|1147743|gb|U42211.1|EHU42211 [1147743] 
gi|3676412|gb|AF051917.1|AF051917 [3676412] 

gi|3676164|emb|AJ011113.1|EFA011113 [3676164] 
gi|2612869|gb|AF005726.1|AF005726 [2612869] 
gi|2353762|gb|AF016233.1|AF016233 [2353762] 



gi|2149899|gb|U94707.1|EFU94707 [2149899] 
gi|2149149|gb|U82366.1|LSU82366 [2149149] 
gi|1469463|gb|U49512.1|EFU49512 [1469463] 
gi|1244503|gb|U35366.1|EFU35366 [1244503] 
gi|833854|gb|U26268.1|EFU26268 [833854] 
gi|841200|gb|U18931.1|CPU18931 [841200] 
gi|460079|gb|U00457.1|U00457 [460079] 
gi|460077|gb|U00456.1|U00456 [460077] 
gi|535661|gb|L34675.1|INSTRANSPO [535661] 
gi|3023041|gb|AF007787.1|AF007787 [3023041] 
gi|431124|gb|L15633.1|TRN916ENT [431124] 
gi|388106|gb|L23802.1|ENEEBSA [388106] 
gi|3608387|gb|AF071085.1|AF071085 [3608387] 
gi|3551851|gb|AF076027.1|AF076027 [3551851] 
gi|3551773|gb|U94770.1|SPU94770 [3551773] 
gi|3551743|gb|U57498.1|ECU57498 [3551743] 
gi|3243178|gb|AF063010.1|AF063010 [3243178] 
gi|3136316|gb|AF063900.1|AF063900 [3136316] 
gi|3540256|gb|AF052459.1|AF052459 [3540256] 
gi|755215|gb|U17696.1|LLU17696 [755215] 
gi|3421437|gb|AF082295.1|AF082295 [3421437] 
gi|3421436|gb|AF082294.1|AF082294 [3421436] 
gi|3421435|gb|AF082293.1|AF082293 [3421435] 
gi|3421434|gb|AF082292.1|AF082292 [3421434] 
gi|3341430|emb|Y17797.1|EFY17797 [3341430] 
gi|3319647|emb|X69092.1|EHPBP3RA [3319647] 
gi|3292886|emb|AJ007584.1|EFA7584 [3292886] 
gi|3261536|emb|AL021958.1|MTV041 [3261536] 
gi|3250708|emb|Z95150.1|MTCY164 [3250708] 
gi|3249688|gb|AF070678.1|AF070678 [3249688] 
gi|3249687|gb|AF070677.1|AF070677 [3249687] 
gi|3249686|gb|AF070676.1|AF070676 [3249686] 
gi|3219158|dbj|AB015233.1|AB015233 [3219158] 
gi|2765275|emb|Y12924.1|SPY12924 [2765275] 
gi|3183687|emb|Y11621.1|EA16SRRN [3183687] 
gi|2765274|emb|Y12923.1|EFY12923 [2765274] 
gi|2765273|emb|Y12922.1|ESY12922 [2765273] 
gi|2765272|emb|Y12921.1|ESY12921 [2765272] 
gi|2765271|emb|Y12920.1|EDY12920 [2765271] 
gi|2765270|emb|Y12919.1|ESY12919 [2765270] 
gi|2765269|emb|Y12918.1|ECY12918 [2765269] 
gi|2765268|emb|Y12917.1|ECY12917 [2765268] 
gi|2765267|emb|Y12916.1|EPY12916 [2765267] 
gi|2765266|emb|Y12915.1|ESY12915 [2765266] 
gi|2765265|emb|Y12914.1|ERY12914 [2765265] 
gi|2765264|emb|Y12913.1|EMY12913 [2765264] 
gi|2765263|emb|Y12912.1|EHY12912 [2765263] 
gi|2765262|emb|Y12911.1|EMY12911 [2765262] 
gi|2765261|emb|Y12910.1|EGY12910 [2765261] 
gi|2765260|emb|Y12909.1|EDY12909 [2765260] 
gi|2765259|emb|Y12908.1|ECY12908 [2765259] 
gi|2765258|emb|Y12907.1|EAY12907 [2765258] 
gi|2765257|emb|Y12906.1|EFY12906 [2765257] 
gi|2765256|emb|Y12905.1|EFY12905 [2765256] 
gi|2894541|emb|AJ223332.1|EFAJ3332 [2894541] 
gi|2894539|emb|AJ223331.1|EFAJ3331 [2894539] 
gi|3108058|gb|AF060881.1|AF060881 [3108058] 
gi|3087776|emb|AJ223633.1|EFAJ3633 [3087776] 
gi|3080754|gb|AF016483.1|AF016483 [3080754] 
gi|2197119|gb|AF003921.1|AF003921 [2197119] 
gi|2982722|dbj|AB012213.1|AB012213 [2982722] 
gi|2982721|dbj|AB012212.1|AB012212 [2982721] 
gi|2058780|gb|B07890.1|B07890 [2058780] 
gi|2058779|gb|B07889.1|B07889 [2058779] 
gi|2058778|gb|B07888.1|B07888 [2058778] 
gi|2058777|gb|B07887.1|B07887 [2058777] 
gi|2058776|gb|B07886.1|B07886 [2058776] 
gi|2058775|gb|B07885.1|B07885 [2058775] 
gi|2058774|gb|B07884.1|B07884 [2058774] 
gi|2058773|gb|B07873.1|B07873 [2058773] 
gi|2058772|gb|B07872.1|B07872 [2058772] 
gi|2058771|gb|B07871.1|B07871 [2058771] 
gi|2058770|gb|B07870.1|B07870 [2058770] 
gi|2058769|gb|B07869.1|B07869 [2058769] 
gi|2058768|gb|B07868.1|B07868 [2058768] 
gi|2058767|gb|B07867.1|B07867 [2058767] 
gi|2058766|gb|B07866.1|B07866 [2058766] 
gi|2058765|gb|B07865.1|B07865 [2058765] 
gi|2058764|gb|B07864.1|B07864 [2058764] 
gi|2058763|gb|B07883.1|B07883 [2058763] 



gi|2058762|gb|B07882.1|B07882 [2058762] 
gi|2058761|gb|B07881.1|B07881 [2058761] 
gi|2058760|gb|B07880.1|B07880 [2058760] 
gi|2058759|gb|B07879.1|B07879 [2058759] 
gi|2058758|gb|B07878.1|B07878 [2058758] 
gi|2058757|gb|B07877.1|B07877 [2058757] 
gi|2058756|gb|B07876.1|B07876 [2058756] 
gi|2058755|gb|B07875.1|B07875 [2058755] 
gi|2058754|gb|B07874.1|B07874 [2058754] 
gi|2058753|gb|B07863.1|B07863 [2058753] 
gi|2058752|gb|B07862.1|B07862 [2058752] 
gi|2058751|gb|B07861.1|B07861 [2058751] 
gi|2058750|gb|B07860.1|B07860 [2058750] 
gi|2058749|gb|B07859.1|B07859 [2058749] 
gi|2058748|gb|B07858.1|B07858 [2058748] 
gi|2058747|gb|B07857.1|B07857 [2058747] 
gi|2058746|gb|B07856.1|B07856 [2058746] 
gi|2058745|gb|B07855.1|B07855 [2058745] 
gi|2058744|gb|B07854.1|B07854 [2058744] 
gi|2058743|gb|B07853.1|B07853 [2058743] 
gi|2058742|gb|B07852.1|B07852 [2058742] 
gi|2058741|gb|B07851.1|B07851 [2058741] 
gi|2058740|gb|B07850.1|B07850 [2058740] 
gi|2947527|gb|T25933.1|T25933 [2947527] 
gi|2924302|emb|X81655.1|EHERMAM [2924302] 
gi|2664256|emb|Y12234.1|EFAS48C [2664256] 
gi|2879906|dbj|D85752.1|D85752 [2879906] 
gi|2746216|gb|AF028836.1|AF028836 [2746216] 
gi|2745825|gb|AF039139.1|AF039139 [2745825] 
gi|2696019|dbj|AB007844.1|AB007844 [2696019] 
gi|48999|emb|X62280.1|EHPBP5G [48999] 
gi|2654477|gb|U89914.1|BFU89914 [2654477] 
gi|43347|emb|X68646.1|EHPSRAA [43347] 
gi|2613034|gb|AH005624.1|SEG_EDDH4RR [2613034] 
gi|2613033|gb|AF029775.1|EDDH4RR2 [2613033] 
gi|2613032|gb|AF029774.1|EDDH4RR1 [2613032] 
gi|2613031|gb|AH005623.1|SEG_EDDHIRR [2613031] 
gi|2613030|gb|AF029773.1|EDDHIRR2 [2613030] 
gi|2613029|gb|AF029772.1|EDDHIRR1 [2613029] 
gi|2613028|gb|AH005622.1|SEG_EDH19RR [2613028] 
gi|2613027|gb|AF029771.1|EDH19RR2 [2613027] 
gi|2613026|gb|AF029770.1|EDH19RR1 [2613026] 
gi|2613025|gb|AH005621.1|SEG_EDISRR [2613025] 
gi|2613024|gb|AF029769.1|EDISRR2 [2613024] 
gi|2613023|gb|AF029768.1|EDISRR1 [2613023] 
gi|1881226|dbj|AB001488.1|AB001488 [1881226] 
gi|2547160|gb|AF023104.1|AF023104 [2547160] 
gi|2547159|gb|AF023103.1|AF023103 [2547159] 
gi|2547158|gb|AF023102.1|AF023102 [2547158] 
gi|2547157|gb|AF023101.1|AF023101 [2547157] 
gi|2415383|gb|AF015775.1|AF015775 [2415383] 
gi|2388636|gb|U94356.1|EFU94356 [2388636] 
gi|2388634|gb|U94355.1|ECU94355 [2388634] 
gi|2340825|dbj|D26045.1|D26045 [2340825] 
gi|2226147|emb|Y14080.1|BSY14080 [2226147] 
gi|2327026|gb|U87997.1|EFU87997 [2327026] 
gi|2318058|gb|AF012532.1|AF012532 [2318058] 
gi|1848175|emb|X87189.1|EM23S5SSP [1848175] 
gi|1848174|emb|X87187.1|EM16S23SS [1848174] 
gi|1848173|emb|X87188.1|EM16S23SP [1848173] 
gi|1848172|emb|X87185.1|EH23S5SSP [1848172] 
gi|1848171|emb|X87184.1|EH16S23SS [1848171] 
gi|1848170|emb|X87181.1|EF23S5SSP [1848170] 
gi|1848169|emb|X87183.1|EF23S5SPA [1848169] 
gi|1848168|emb|X87191.1|EF23S5SAC [1848168] 
gi|1848167|emb|X87180.1|EF16S23SS [1848167] 
gi|1848166|emb|X87182.1|EF16S23SP [1848166] 
gi|1848165|emb|X87190.1|EF16S23SC [1848165] 
gi|1848164|emb|X87186.1|EF16S23SA [1848164] 
gi|1848156|emb|X87179.1|ED23S5SSP [1848156] 
gi|1848155|emb|X87178.1|ED16S23SS [1848155] 
gi|1848154|emb|X87177.1|ED16S23SA [1848154] 
gi|2274942|emb|AJ000346.1|EHNAPBC [2274942] 

gi|2274939|emb|AJ000042.1|EFGLS24B [2274939] 
gi|414575|gb|L12710.1|ENEAAC [414575] 
gi|2245603|gb|AF006008.1|AF006008 [2245603] 



gi|2231992|gb|U94530.1|EFU94530 [2231992] 
gi|2231990|gb|U94529.1|EFU94529 [2231990] 
gi|2231988|gb|U94528.1|EFU94528 [2231988] 
gi|2231986|gb|U94527.1|EFU94527 [2231986] 
gi|2231984|gb|U94526.1|EFU94526 [2231984] 
gi|2231982|gb|U94525.1|ECU94525 [2231982] 
gi|2231980|gb|U94524.1|ECU94524 [2231980] 
gi|2231978|gb|U94523.1|ECU94523 [2231978] 
gi|2231976|gb|U94522.1|ECU94522 [2231976] 
gi|2231974|gb|U94521.1|ECU94521 [2231974] 
gi|2196685|gb|U25090.1|EFU25090 [2196685] 
gi|2197120|gb|AF003922.1|AF003922 [2197120] 
gi|2196683|gb|U25095.1|EFU25095 [2196683] 
gi|2196681|gb|U25094.1|EFU25094 [2196681] 
gi|2196679|gb|U25093.1|EFU25093 [2196679] 
gi|2196677|gb|U25092.1|EFU25092 [2196677] 
gi|2196675|gb|U25091.1|EFU25091 [2196675] 
gi|2196673|gb|U24682.1|EFU24682 [2196673] 
gi|532533|gb|U09422.1|EFU09422 [532533] 
gi|487271|dbj|D17462.1|ENENTP [487271] 
gi|468459|dbj|D28859.1|ENEPPD1 [468459] 
gi|440135|dbj|D16334.1|ENEATPK [440135] 
gi|391680|dbj|D13816.1|ENENAABS [391680] 
gi|1402524|dbj|D78257.1|D78257 [1402524] 
gi|709995|dbj|D30808.1|BACYCB20 [709995] 
gi|2109265|gb|U91527.1|EFU91527 [2109265] 
gi|1041112|dbj|D78016.1|ENEPPD1A [1041112] 
gi|1339880|dbj|D85392.1|ENERPA [1339880] 
gi|1339878|dbj|D85393.1|ENEGElE [1339878] 
gi|662918|emb|Z46807.1|EHCOPAYZ [662918] 
gi|769796|emb|X86176.1|EFRPODDNE [769796] 
gi|1854638|gb|U51479.1|EGU51479 [1854638] 
gi|1857221|gb|U72706.1|EFU72706 [1857221] 
gi|1857219|gb|U72704.1|EFU72704 [1857219] 
gi|1857217|gb|U72705.1|ECU72705 [1857217] 
gi|1272655|emb|X96978.1|EFPPD1GNS [1272655] 
gi|1272652|emb|X96976.1|EFPLSEP1G [1272652] 
gi|1279406|emb|X96977.1|EFPAD1ORF [1279406] 
gi|1070149|emb|X93211.1|EFTNF01 [1070149] 
gi|1065723|emb|X92947.1|EFTETMGN [1065723] 

gi|1019639|gb|L38972.1|PH4COINJN [1019639] 

gi|1151151|gb|U43087.1|EFU43087 [1151151] 

gi|1098507|gb|U17283.1|BMU17283 [1098507] 

gi|1498072|gb|U64887.1|EFU64887 [1498072] 

gi|1498071|gb|U64886.1|EFU64886 [1498071] 

gi|1469783|gb|U58049.1|EHU58049 [1469783] 

gi|1763666|gb|U81452.1|EFU81452 [1763666] 

gi|624694|gb|L38973.1|PH4SEQ [624694] 

gi|1730458|emb|Z83305.1|EFVANRES [1730458] 

gi|1419498|emb|X84796.1|ECPFW4 [1419498] 

gi|1419497|emb|X84795.1|ECPFW3 [1419497] 

gi|1419496|emb|X84794.1|ECPFW1 [1419496] 

gi|254400|gb|S43266.1|S43266 [254400] 

gi|239025|gb|S66277.1|S66277 [239025] 

gi|1054931|gb|U38590.1|EFU38590 [1054931] 

gi|1244573|gb|U39788.1|EHU39788 [1244573] 

gi|1244571|gb|U39789.1|EGU39789 [1244571] 

gi|1244569|gb|U39790.1|EFU39790 [1244569] 

gi|1255020|gb|U39777.1|ESU39777 [1255020] 

gi|1255018|gb|U39775.1|EPU39775 [1255018] 

gi|1255016|gb|U39778.1|EDU39778 [1255016] 

gi|1255014|gb|U39776.1|ECU39776 [1255014] 

gi|1255012|gb|U39774.1|EAU39774 [1255012] 

gi|1619922|gb|U69267.1|IVU69267 [1619922] 

gi|790436|emb|X84861.1|EFEFMPBP5 [790436] 

gi|790434|emb|X84858.1|EFD63RPSR [790434] 

gi|790432|emb|X84862.1|EF721PBP5 [790432] 

gi|790430|emb|X84860.1|EF63RPBP5 [790430] 

gi|790428|emb|X84859.1|EF366PBP5 [790428] 

gi|1572800|gb|U70854.1|CELF38A5 [1572800] 

gi|1041816|gb|U17153.1|EFU17153 [1041816] 

gi|1086523|gb|U39859.1|EFU39859 [1086523] 

gi|403564|gb|U01917.1|EFU01917 [403564] 

gi|1515474|gb|U66286.1|EFU66286 [1515474] 

gi|1513068|gb|U15554.1|LMU15554 [1513068] 

gi|1296520|emb|X94181.1|EFENTAORF [1296520] 

gi|1488069|gb|U63997.1|EFU63997 [1488069] 
gi|1209525|gb|U35369.1|EFU35369 [1209525] 



gi|1469341|gb|U30931.1|ESU30931 [1469341] 
gi|488331|gb|M77276.1|SYNGIP2122 [488331] 
gi|1046177|gb|U39733.1| [1046177] 
gi|1236613|gb|U49939.1|CVU49939 [1236613] 
gi|47491|emb|X55766.1|SS16SR5G [47491] 
gi|47490|emb|X55767.1|SS16SR3G [47490] 
gi|47061|emb|X56353.1|SFTET916 [47061] 
gi|49022|emb|X62755.1|SFNPRG [49022] 
gi|47047|emb|X17214.1|SFPASAl [47047] 
gi|47044|emb|X68847.1|SFNOXAA [47044] 
gi|47033|emb|V01547.1|SFKANR [47033] 
gi|47018|emb|X02027.1|SF5SRNA [47018] 
gi|511044|emb|X75752.1|MP16SRNAO [511044] 
gi|511043|emb|X75751.1|MP16SR243 [511043] 
gi|886481|emb|X82819.1|ESPLPAM [886481] 
gi|517387|emb|X76177.1|ES16SRR [517387] 
gi|472916|emb|X76913.1|EHNTPOP [472916] 
gi|43351|emb|X55133.1|ES16SRRN [43351] 
gi|1143442|emb|X92687.1|EFPBP5G [1143442] 
gi|963032|emb|Z50854.1|EHARPQTOU [963032] 
gi|886479|emb|X84818.1|EHDNAPSR [886479] 
gi|551437|emb|X81654.1|EHIS1216 [551437] 
gi|467805|emb|X78425.1|EFPBP5 [467805] 
gi|296721|emb|X55961.1|EFPD78 [296721] 
gi|287946|emb|Z19137.1|EFPTSHGN [287946] 
gi|49042|emb|X63285.1|EHNAKA [49042] 
gi|49019|emb|X62658.1|EFSEAl [49019] 
gi|43337|emb|Z12296.1|EFSPREG [43337] 
gi|43335|emb|X56895.1|EFPVANAG [43335] 
gi|43333|emb|X16421.1|EFPF54 [43333] 
gi|43331|emb|X62657.1|EFORF3 [43331] 
gi|1065721|emb|X92945.1|EFCAT501 [1065721] 
gi|806551|emb|Z49243.1|EF4110SOP [806551] 
gi|806549|emb|Z49244.1|EF4105SOD [806549] 
gi|505530|emb|X79542.1|EFAS48 [505530] 
gi|43323|emb|X62656.1|EFASPl [43323] 
gi|40840|emb|X56422.1|EC16SRNAG [40840] 
gi|48189|emb|X04388.1|TN1545TR [48189] 
gi|928814|gb|L40841.1|ENETRANSPO [928814] 
gi|141856|gb|L01794.1|AD1REPABC [141856] 
gi|149125|gb|M90647.1|IP8VANY [149125] 
gi|141862|gb|M87836.1|AD1TRAE1 [141862] 
gi|141860|gb|M84374.1|AD1TRAA [141860] 
gi|141853|gb|M62888.1|AD1PAD1 [141853] 
gi|1101637|dbj|D31674.1|EVM16RNA7 [1101637] 
gi|1101636|dbj|D31675.1|ENE16RNA8 [1101636] 
gi|497792|dbj|D31676.1|ENC16RNA9 [497792] 
gi|1022729|gb|U36195.1|EFU36195 [1022729] 
gi|488338|gb|M77279.1|SYNGIP3124 [488338] 
gi|488335|gb|M77278.1|SYNGIP2563 [488335] 
gi|488333|gb|M77277.1|SYNGIP2124 [488333] 
gi|488329|gb|M77275.1|SYNGIP2121 [488329] 
gi|388267|gb|L19532.1|AD1TRAC [388267] 
gi|493016|gb|U03756.1|EFU03756 [493016] 
gi|453536|gb|L28754.1|INSTRAN [453536] 
gi|153658|gb|M58002.1|STRHYDROLA [153658] 
gi|475427|gb|U00681.1|EFU00681 [475427] 
gi|818704|gb|U24692.1|EFU24692 [818704] 
gi|155036|gb|M97297.1|TRNVAN [155036] 
gi|150552|gb|M64978.1|PCFPRGAB [150552] 
gi|786274|gb|U22541.1|EHU22541 [786274] 
gi|786273|gb|U22540.1|EHU22540 [786273] 
gi|559858|gb|L37110.1|AD1CLYL [559858] 
gi|643614|gb|U16659.1|ECU16659 [643614] 
gi|643612|gb|U16658.1|ECU16658 [643612] 
gi|290641|gb|L13292.1|ENECOPPUMP [290641] 
gi|624701|gb|L29639.1|ENEVANCRF [624701] 
gi|624699|gb|L29638.1|ENEVANCR [624699] 
gi|624692|gb|L29641.1|ENEDDLA [624692] 
gi|624690|gb|L29640.1|ENEDDL [624690] 
gi|493094|gb|L32813.1|ENERRD [493094] 



gi|153852|gb|AH000939.1|SEG_STRTN916 [153852] 
gi|153851|gb|M22645.1|STRTN9162 [153851] 
gi|153850|gb|M20864.1|STRTN9161 [153850] 
gi|153660|gb|M36878.1|STRIF2BA [153660] 
gi|153585|gb|M13771.1|STRBRP [153585] 
gi|153575|gb|M64265.1|STRATPEFHA [153575] 
gi|153565|gb|M90060.1|STRATPASEA [153565] 
gi|152969|gb|M92376.1|STABLAIA [152969] 
gi|309660|gb|L14285.1|PCFPRGWZY [309660] 
gi|433714|gb|L12033.1|ENESATA [433714] 
gi|290645|gb|L15304.1|ENEVANB2A [290645] 
gi|148331|gb|M84146.1|ENEVANR [148331] 
gi|148329|gb|M64304.1|ENEVANH [148329] 
gi|148326|gb|M68910.1|ENEVANCRES [148326] 
gi|148324|gb|M75132.1|ENEVANC [148324] 
gi|148323|gb|L06138.1|ENEVANB [148323] 
gi|148321|gb|M85225.1|ENETETM [148321] 
gi|148320|gb|L00925.1|ENERTRNA [148320] 
gi|148319|gb|L00924.1|ENERRNA [148319] 
gi|148317|gb|M81466.1|ENERECA [148317] 
gi|148315|gb|M81961.1|ENENAPA [148315] 
gi|148312|gb|M38386.1|ENEMSPDPS [148312] 
gi|148310|gb|M37185.1|ENEGELE [148310] 
gi|148307|gb|L07892.1|ENEBLACREG [148307] 
gi|148305|gb|M60253.1|ENEBELAA [148305] 
gi|148303|gb|M77639.1|ENEB14NAM [148303] 
gi|290644|gb|L16515.1|ENERGTG [290644] 
gi|154954|gb|M37184.1|TRN916 [154954] 
gi|148301|gb|M69221.1|ENEAAD9A [148301] 
gi|148308|gb|M38052.1|ENECYLB [148308] 
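Each entry in Table 27 states its GI number twice, once after "gi|" and once in square brackets, which gives a simple internal check on transcription errors. The Python sketch below is an illustration only and not part of the original disclosure. It splits an entry into fields and rejects lines whose two GI copies disagree or whose layout is damaged. The function name parse_entry and the field labels are assumptions, and entries wrapped over two lines would need to be joined first.

```python
import re

# Assumed layout: gi|<number>|<database>|<accession>.<version>|<locus> [<number>]
_ENTRY = re.compile(
    r"^gi\|(?P<gi>\d+)\|(?P<db>gb|emb|dbj)\|"
    r"(?P<accession>[A-Z]+\d+)\.(?P<version>\d+)\|"
    r"(?P<locus>\S*)\s*\[(?P<gi_check>\d+)\]$"
)


def parse_entry(line):
    """Return the fields of one table entry, or None if the line fails the checks."""
    match = _ENTRY.match(line.strip())
    if match is None:
        return None  # layout damaged, e.g. a letter inside the GI number
    if match.group("gi") != match.group("gi_check"):
        return None  # the two GI copies disagree
    return {
        "gi": int(match.group("gi")),
        "db": match.group("db"),
        "accession": match.group("accession"),
        "version": int(match.group("version")),
        "locus": match.group("locus"),
    }


print(parse_entry("gi|4803755|dbj|AB026843.1|AB026843 [4803755]"))
print(parse_entry("gi|3342U7|gb|AF076604.1|AF076604 [3342117]"))  # damaged line -> None
```

A line that returns None is only flagged for manual review; the sketch does not attempt to guess the intended value.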
Table 28 



Phage Dp-1 complete genome sequence. 56506 nucleotides. 



1 ataataaaaa tatgaagcag atattgggtt aattattgct taacaaaatg caccgaattt gtgtataata 
71 taagtgaagc agttttgtaa acctgacatc ctgctaaata aaaataaagg aggctcgaac atgagtcaaa 
141 acactacacg cactgacgct gaattgacag gcgttactct tttaggaaac caagacacca aatacgatta 
211 tgactataat ccagacgtcc ttgaaacttt ccctaacaaa- catcctgaaa ataattacct agtaacattt 

281 gacggatatg aattcacttc cctttgccct aaaacaggac agcctgactt cgcgaatgtt ttcattagtt 
351 acattccaaa cgaaaagatg gttgaatcta aatcattgaa attgtactta ttcagtttcc gtaaccacgg 
421 tgacttccac gaagatcgca tgaacattat tttgaatgac ttgtatgaat tgatggaacc taagtacatt 
491 gaagtcatgg gcctattcac tcctcgtggt ggaatttcaa tttacccatt cgtcaacaaa gtgaatcctc 
561 aatttgcaac tcctgaactt gaacagcttc aacttcaacg caaattgaac ttccttggaa atgttcaagg 
631 tcttggacga gctattcgat aggaggctgg aatgaaatca gtagttttat tatccggcgg agtcgactca 
701 gccacttgtt tagcaattga agttgacaag tggggttcta aaaatgttca tgctatagca ttcaattacg 
771 gacaaaagca tgaagcagaa cttgaaaatg ctgctaatgt tgcaatgttc tacggagtca agttcaccat 
841 tcttgaaatt gactcgaaaa tctactcaag ctctagctct tccttattac aaggaaaagg cgaaatttca 
911 catggaaaat cttacgctga aatcctagca gagaaggaag tagttgacac ctatgttcca tttagaaatg 
981 gactaatgct ttcacaggct gcggcttatg cttattcggt tggagcttct tacgtcgtat atggtgctca 
1051 cgcagacgat gcggctggag gtgcttaccc tgattgcact cctgagttct ataattcaat gtcaaatgca 
1121 atggaatatg gaactggagg caaggtaacc cttgtcgctc ctctacttac tctaaccaag gcgcaagtcg 
1191 ttaaatgggg aattgattta gatgttcctt atttcttgac tcgttcatgt tatgaaagtg acgctgaaag 
1261 ttgtggaact tgcgcaactt gtatcgaccg caaaaaggca ttcgaagaaa atggaatgac tgaccctatt 
1331 cattataagg agaattgata tgagagtttc taaaacctta acattcgacg cagctcatca actagttgga 
1401 cattttggaa aatgcgcaaa tttgcacggg catacttaca aagtcgaaat ttcattagca ggcggaactt 
1471 atgaccacgg ttcgagtcaa gggatggttg ttgactttta tcacgtcaag aaaatcgcag gtacattcat 
1541 tgacagactt gaccacgctg ttcttcttca agggaatgaa ccaatcgctt tagcaaatgc agttgacacc 
1611 aagcgagttc tatttggatt tagaactacg gctgagaata tgtcaagatt ccttacctgg actctcacgg 
1681 agcttatgtg gaagcatgct cgtatcgact ctatcaaact atgggaaact cctacaggtt gcgcagaatg 
1751 tacttactac gagattttca cagaagacga gattgaaatg ttcaagaacg taacctttat cgacaaagac 
1821 gaaaagatta ctgtccgcga aattttagag caggagcagg ataatggtta atcaatacaa tcagcctgaa 
1891 agaggcaaga ttcgaatcaa tgttcgcgac cctgagaaaa tgcctatcat ggaaattttc ggtcctacaa 
1961 ttcaaggtga aggaatggtt ataggtcaaa agactatttt cattcgaact ggtggatgcg actatcattg 
2031 caactggtgt gactcagcct ttacctggaa cggtaccact gagccggaat atatcacagg caaagaagct 
2101 gctagtcgaa tcttgaaact agctttcaat gataaaggtg aacagatttg taaccacgtg acattgactg 
2171 gaggaaatcc tgccttaatc aacgagccta tggctaagat gatttcgatt ctaaaagaac atggattcaa 
2241 gtttggtctc gaaactcaag gaactcgatt ccaagaatgg ttcaaagaag taagcgatat cactattagt 
2311 cctaaaccgc cttcaagtgg aatgagaact aatatgaaaa ttcttgaagc tattgtagat agaatgaatg 
2381 atgaaaacct tgactggtca tttaaaatcg ttatctttga cgaaaatgac cfcagcttatg cgcgtgatat 
2451 gtttaaaact ttcgaaggca agttacgtcc agtgaactac ctttcagttg ggaatgcaaa cgcatacgaa 
2521 gaaggaaaaa tcagtgatag gcttcttgaa aagttgggat ggctttggga taaagtgtat gaagacccag 
2591 ctttcaacaa tgttcgacct ttaccgcaac ttcatacact tgtttatgat aataaaagag gagtataaaa 
2661 tgaaaattga gcatctagat aaaatcggta acgtattagg gagagagaac ggatgggctt cccttaagcc 
2731 ggatgaaatt gtaaccttgg acaatactga ggcagccgtt caaagacttt ttggtctatt aggcgaggac 
2801 gcagaacgtg acgggttgca agatactcca ttccgttttg ttaaagcact cgctgaacat accgtagggt 
2871 atcgagaaga ccctaaactt catctcgaaa aaacattcga cgtcgaccat gaagaccttg ttcttgtgaa 
2941 agacattcca ttcaattctt tatgtgagca tcatttagct ccgttcgtag ggaaggtgca tattgcatac 
3011 attcctaagg ataagattac aggtctttca aaattcggtc gagtggttga aggatacgct aaacgacttc 
3081 aagtacaaga gcgcttgact caacaaatcg ctgacgctat tcaggaagtt ctaaatcctc aagcagttgc 
3151 ggtcatcgta gaggctgagc atacttgcat gagcggacgc ggtattaaga agcacggggc aacgacagtg 
3221 acttcaacta tgcgaggtct tctccaagat gacgcatctg ctcgagcaga attgcttcag ttgattaaaa 
3291 agtaggaggc ggaaaatgaa taaaagtgca accttttggc ttgttcgaac agctcttatt gcggctctat 
3361 atgtgacatt gaccgttgca ttttctgcta ttagttatgg acctattcaa tttagagtca gtgaagcctt 
3431 gattcttcta cctttatgga accatagatg gactccgggg attgtattag gaacaattat tgcaaacttc 
3501 ttttcacctc ttggactgat tgacgtttta ttcggttcac ttgctacctt ccttggagta gtggcaatgg 
3571 tgaaagttgc taagatggca agtcctctat attcacttat ctgtccagtt cttgctaatg cttaccttat 
3641 tgcgctggaa cttcgaatag tttactcttt acctttttgg gaatctgtca tctatgtagg aattagtgaa 
3711 gcgattatcg ttttaatttc atacttcctt attcccacgc tggcgaagaa caatcatttt agaacactga 
3781 taggagcgaa aaatgggatt taatctatac ttcgcaggag gtcacgctat tagcactgac gattatttga 
3851 aggaaagagg agccaatcgc ctattcaatc aactgtacga aagaaacggg attggcaaaa ggtggattga 
3921 gcacaagaaa accaatccaa gcactacttc aaaactattc gtcgactcta gtgcatattc tgctcatacc 
3991 aaaggggctg aagttgacat tgacgcctat atcgaatacg tgaatgataa cgtgggaatg tttgaccgta 
4061 tcgccgaact cgataaaatt cctggtgtat ttagacagcc taagacacgt gaacagcttt tggaagcacc 
4131 acaaatttct tgggataatt atctatacat gcgcgagcga atggttgaga aagacaagct cttacctatt 
4201 ttccatatgg gagaagactt taaatggctc aacttgatgc ccgaaactac attcgaaggc ggaaagcata 
4271 ttccttacat tggaatttca ccagccaatg actcgactac gaagcataaa gacaagtgga tggaaagagt 
4341 attcgaagtt attcgaaaca gttctaatcc agacgttaag actcacgcat ttgggatgac agttactagc 
4411 caattagagc gtcacccatt ctatagcgcc gactctactt ctgtactgct cacaggagcg atgggaaaca 
4481 ttatgacgtc aaaaggatta gttgacttgt cacagaagaa tggaggaatt gatgctgtcc gcaggctgcc 
4551 aaaaccggtt caagttgaaa ttgaatccat tatcgaagaa actggagcgc attttagcct agagcaatta 
4621 gttgaggact ataaacttcg agcattgttc aatgttcaat acatgctgaa ttgggcagag aactatgaat 
4691 tcaagggaat taaaaatcgt caacgtcgac tattttagat aagagctttt cgctcttatt ttttttaaaa 






4761 aaaaatgaac tttttataca aaaacgcttg actttattca ctcattatcg tataatcata atataaataa 

4831 aacgaataag aggtaaataa aatgacagca gttcaacaag ttaagttcta cttagaagaa gccggcgctc 

4901 actttctaaa agatgttgag tacagtgaca acttagagca agcaattatg aaagatattc ttaaatggaa 

4971 tggcgctcat agagatgagc acgatatgaa aataacttca tacgaagtat tatagagagg ggtaaggcta 

5041 tgaaaaaagt tcaaacttat caagaatatc taaaactagt tgagttcaaa cgtcaacttt ctttaaatct 

5111 tcgagaagga aaaataggag tcgatgaagc ggttattcaa ttattcacct tctatagttt caacaatatc 

5181 gaggaacctc ctttcattgt actcaaaatg caagaggctg ccgtgaacgg gacttatgaa gcaaaactca 

5251 atatgcttaa aagatttaaa attatttaga aacggcttta caaactcgcg ataattcgtg tatattatat 

5321 atatcaaaaa aaggaggctc atattatgag tattaagttc aaaaccgaag aactttcaaa aattgtttct 

5391 cagctcaata agttgaagcc tagcaagttg ctagaaatca caaactattg gcatattttt ggtgacggcg 

5461 aatgcgtcat gtttacagcg tatgatggct caaacttcct tcgatgcatt atcgacagcg atgttgaaat 

5531 tgacgtgatt gtgaaagcag agcagtttgg aaaacttgta gaaaagacca cggccgcaac cgtcacatta 

5601 gttcctgaag aatcttcgct aaaagttatt gggaatggtg agtacaatat tgatattgtt acagaagatg 

5671 aagagtaccc tacattcgac cacttgctcg aagacgtgag tgaagaaaat gctctcactt tgaaaagctc 

5741 gctgttctac ggaatcgcca atatcaacga ttctgcggta tctaaatcag gagcagatgg aatttatacc 

5811 ggcttcctgt taaaaggcgg aaaagcaatt actacagaca tcattcgcgt atgtatcaac cctatcaagg 

5881 aaaagggact agaaatgctc attccttaca acctaatgag tattttagca agtattcctg atgagaagat 

5951 gtacttctgg caaattgacg atactactgt ctatatttca tcggcttcag tcgaaattta tggaaaattg 

6021 atggaaggta tggaagatta tgaagacgtt tcacagcttg actcaattga gtttgaagat gatgcggcta 

6091 tccctacagc agaaatcctg agcgtattag accgccttgt actattcact tcagcctttg acaaaggaac 

6161 cgtcgaattc ttattcttga aagaccgact tcgaattaaa acttctacta gcagttatga agacatcatg 

6231 tacgcatctg ctggcaagaa agtttcgaag aaagaattca cttgccacct taacagctta ctcttgaagg 

6301 aaattgtatc aaccgtcacc gaagaaaacc tcactgtctc ttatggaagc gaaaccgcaa ttaagatttc 

6371 atcgaatggt gtcgtttact tcctagcact tcaagagccg gaagaataat ggccaagtcc aatttaacta 

6441 gaattgcaaa gatggttaga gcaggaaaca gtgaaggtcc tgcttcatct tttgccaatt cgctgacccg 

6511 ggttattgaa cgaactcagc ctgaatataa tccttcgaca tattataagc ccagcggggt tggtggatgt 

6581 attcgaaaaa tgtatttcga aagaatcggt gagtctatta tagataacgc agattctaac ctaattgcaa 

6651 tgggcgaagc tggaacattt aggcacgaag ttctccaaga gtacatggtt aaaatggctg aaatcgatga 

6721 ggactttgaa tggttgaatg tagcagagtt cttgaaagaa aatccagttg aaggaactat cgtcgacgag 

6791 cgtttcaaga aaaacgatta tgaaacgaag tgtaagaacg aacttcttca actttcattc ttgtgtgacg 

6861 gactagttcg atataaaggc aagctctaca ttttagagat taagactgaa accatgttca agttcactaa 

6931 acatactgag ccctatgaag aacacaagat gcaagcaact tgctacggaa tgtgtctagg agtcgatgat 

7001 gtcattttcc tttatgaaaa tcgagataac ttcgaaaaga aagcctacac gtttcacatc acagacgaga 

7071 tgaaaaatca agtccttgga aaaattatga cctgcgaaga gtatgtagag aaaggcgaaa gtcctaaaat 

7141 ctattgctct tcagcctatt gcccatattg tagaaaggaa ggtcgaaatc tgtgagctat actggaaaaa 

7211 tgttcgagga agactttttc gaaggtgcaa aagactttga gaaagatgct ttcacggtcc gtctatatga 

7281 taccactaat ggatttcgag gagttgcaaa tccctgcgat tatatagccg caactaactt tgggaccttg 

7351 tttattgaac tgaaaactac taaagaagct tctttgagct ttaataacat cactgataat caatggttcc 

7421 agctatcacg cgcagatgga tgcaaattta ttctcgccgg aattttagtg tatctccaaa agcatgaaaa 

7491 gattatatgg tatccaattt caagccttga aaaaattaaa cggtctggag ttaaaagcgt caacccaaac 

7561 ttcatcgatg cagggtatga agtttcttac aagaagcgtc gaactagatt gaccattcct ttccaaaatg 

7631 ttctagatgc agttgagctt cattacaagg agaaaagcaa tggcaagacc taagttacct caaattgata 

7701 ttcgagaaga agaaatacga gatgctcaag acgtagcaga ctcgtatggt gcgattatca ataaagtagt 

7771 cgacgaaatt gttgaagcag cttgcggttc acttgaccag gcaatggaag aaattcaaat agttgtaagc 

7841 caaaatcctg tcattatgga agaccttaac tactacattg gctatcttcc cactcttctt tatttcgccg 

7911 cagatagggc ggaaatggtg ggaatacaaa tggattcaag ttccgctatc aggaaagaaa aatacgataa 

7981 tctatacatt ttagccgccg ggaaaactat tcctgacaag caagcagaaa ctcgaaaact tgtcatgaat 

8051 gaagaagtca tcgaaaatgc ttacaagcga gcctacaaga aagttcaatt aaagctagaa caggccgata 

8121 aggtattagc atctttaaaa cgaattcaaa cctggcaact agcagagtta gaaactcagt caaataattc 

8191 aaaaggagta ttattaaatg caaaaagacg tagacgtgaa aatgattgac cctaaacttg accgattaaa 

8261 atacacaggt gattgggttg atgtacgaat tagttctatc actaaaattg acgccgacag cgccgatgtc 

8331 tcaagatgtc gaaaagtgct tcaaaaggct caagtatatt cagtggcggc aggtgaatgc attaaaattg 

8401 cacacggatt tgctcttgaa cttcctaagg gatatgaagc aatcttgcat cctcgttcca gtctttttaa 

8471 gaaaactggt ctaatcttcg tttctagcgg agtgattgac gaaggttaca aaggtgacac tgatgaatgg 
8541 ttctcagttt ggtatgctac tcgtgacgca gatatcttct acgaccaaag aattgcccaa tttagaattc 

8611 aggaaaagca acctgctatc aagttcaatt tcgtagaatc tttaggaaat gcggctcgtg gaggccatgg 

8681 aagtacaggt gatttctaat gaaattggaa cagttgatga aggactggaa taaggattcg aaagctcttg 

8751 tagcagttca aggacttgaa cgtgaagcgc ttccaagaat ccctttttct gcgccttcta tgaattatca 

8821 aacctacggc gggctccctc gaaaaagggt agttgaattc ttcggtcctg agtcaagtgg gaaaactact 

8891 tcagctctcg acattgtcaa gaatgcgcaa atggtatttg agcaggaatg ggaacagaag actgaagaac 

8961 tcaaggaaaa gctggaaaat gcgcgtgcat ccaaagctag caagactgct gtcaaggaac ttgaaatgca 

9031 actcgatagt cttcaagagc ctcttaagat tgtatatctt gaccttgaga atacattaga cactgagtgg 

9101 gctaaaaaga ttggagtcga tgttgacaat atttggatag ttcgccctga aatgaacagc gctgaagaaa 

9171 tacttcaata tgttttagac attttcgaaa caggtgaagt tggcctagta gttctagatt ccttgcctta 

9241 catggtcagt caaaacctta ttgatgaaga gttgactaaa aaggcctatg caggaatctc agcgcctttg 

9311 actgaattta gtcgaaaggt tactcctctt cttactcgct acaatgcaat attcctaggc atcaatcaaa 

9381 ttcgagaaga tatgaatagt cagtacaatg cctattcaac tccaggcgga aagatgtgga agcatgcttg 

9451 tgcagttcga cttaaattta gaaaaggtga ctaccttgac gaaaacggtg catcattgac ccgtactgct 

9521 cgaaaccctg cagggaatgt agtagagtca ttcgtcgaga agaccaaagc atttaagccg gacagaaaat 

9591 tagtttccta tacgctttcc tatcatgatg gaattcaaat tgaaaatgac cttgtagatg tcgctgtcga 

9661 atttggagtc attcaaaagg caggggcatg gttcagtatc gtcgaccttg aaactggaga aattatgaca 

9731 gatgaagacg aagaaccatt gaagttccaa ggcaaggcaa atctagttcg acgcttcaag gaggatgact 

9801 acttattcga catggtgatg actgcggttc acgaaattat cactcgagaa gaaggctaat gcaaaaatct 

9871 ctatttggac ctaagctagt gcctgctagt tcaaggcgca agaaaagaac ggttccaaaa cctaaaccta 

9941 aaatcgatga gcaagtggtt gagcttatga accgcagaga gcgtcaagtg cttgttcata gttgcatcta 

10011 ttattacttt aatgactcaa ttatagcaga cgggcagtat gacaaatgga gccacgaact atattctctt 
10081 atagtttcgc accctgatga gtttcgacag actgttctct ataacgagtt taaacagttt gacggaaata 
10151 ctggaatggg tcttccatac gactgtcagt ttgctgtaag ggtcgcagaa aggcttttaa gaaaatgaat 
10221 ttagcttcta aataccgtcc tcaaactttc gaggaagtgg tagctcaaga atatgtcaaa gaaattcttt 
10291 tgaatcaatt acaaaatggc gctatcaaac acggctatct attctgtggt ggcgctggaa ctggtaaaac 
10361 cactactgct cgaattttcg cgaaggatgt gaacaaagga cttggctctc ctattgaaat tgatgctgct 
10431 tctaataatg gggtagaaaa tgttcgaaac attattgaag attctagata caagtctatg gacagcgagt 
10501 tcaaagttta catcattgac gaggttcata tgctttcaac cggagcattt aatgcgctgt tgaaaacatt 
10571 agaagagccc tcatcgggaa ccgtgttcat tctatgtact actgaccctc aaaagattcc tgacactatt 
10641 ctcagtcgag ttcaacggtt tgactttact cgaattgata atgacgacat cgttaatcaa cttcaattta 
10711 ttatcgaaag tgaaaatgaa gaaggagctg gttatagtta tgagcgtgac gccctttcgt ttattgggaa 
10781 acttgcaaat ggaggaatgc gtgacagtat cacaaggctc gaaaaagtcc ttgattatag tcatcacgtt 
10851 gacatggaag ccgtttctaa tgcactagga gttccggact acgaaacatt cgcttcactt gttgaagcta 
10921 ttgccaacta tgacggctca aagtgtttag aaattgtaaa tgacttccac tactcaggaa aagacttgaa 
10991 attagtgact cgaaacttta cagacttcct tttagaggtt tgtaagtatt ggctagttcg agatatttca 
11061 atcactcaac ttcctgctca ttttgaaagt aagctagagc aattctgtga ggcttttcaa tatcctactc 
11131 tattgtggat gctagaagaa atgaatgaac ttgctggagt tgttaaatgg gagcctaatg ctaaaccgat 
11201 aattgaaacc aaacttcttt tgatgagcaa ggaggagtga catgattgga cagggacttg ttaaatctac 
11271 catttcgaaa tggaaacaac ttccaaaata tataatcgtc gaaggtgaag taggttcagg acggaagacc 
11341 ttaatccgtt atattgcttc gaaatttgac gctgattcta ttgtagtagg aacgagtgta gatgacattc 
11411 gaaacatcat tcaggatgca cagactattt tcaaggcgag aatctacgtg atagacggaa atagcctgtc 
11481 aatgtcagct cttaactcgc ttttgaagat agcggaagag ccacctttaa actgtcatat agccatgact 
11551 gttgatagca tcaataatgc cttacctacg cttgcaagta gagcaaaagt tctaaccatg ctaccttata 
11621 ctaatgaaga gaaaatgcag tttgtcaagt cctacaagaa ggtagatact tcaggaattg acgaccgagc 
11691 gattgtagac tattgcaatc ttgccagcaa tcttcaaatg cttgaagaca tattagaata tggcgcagaa 
11761 gagctatttg aaaaggttac aacattttat gacttaatat gggaggcaag tgctagcaat tcgctaaagg 
11831 ttactaattg gctcaaattt aaggaaactg atgaaggaaa aattgagcct aaacttttcc tcaactgtct 
11901 tttaaattgg tcgacagttg tcatcaggaa gcactatgta gaaatgtctt tcgaagaact tgaggcccat 
11971 gaccttttag tgagggaagc atctaggtgt ttgcgaaagg tatctaaaaa gggctcaaat gcgcgtgtct 
12041 gcgtgaacga atttatcagg agggtcaaac aagttgagtg atttagtatc atctcaaaaa gacattcgaa 
12111 ccaataatct aaagccgttc tatatcttgt acggcgaaga aattggtctt atgaatgttt atctcaatca 
12181 aatgggaaat gtagttcgag aaacttcggt ttcaacagtc tggaaaaccc tcactcaaaa agggctcgtt 
12251 tctaatcatc gaatattcgc tgttcgagat gataaggagt ttctgtctaa tgagtcgagg tggaaaaggc 
12321 ttccggatgt tagatatggg acacttgttt tgatggttac taaaattgac aagcgaagca agttgctaaa 
12391 ggcctttcct gataattgtg ttgagtttga gaaaatgact gacgcgcagt tgaaaaggca ttttgtgtct 
12461 aaatactcga ctattgatag cgacatgatt gacatggtta tccagttctg tctaaacgat tactctagaa 
12531 ttgacaatga attggacaag ctgtcgcgat tgaaaaaggt tgacgcatca gtagttgaat ccattgtcaa 
12601 gcacaagacc gaaattgaca ttttcagcct agttgatgat gtattggaat ataggccgga gcaggcaatt 
12671 atgaaagtga ctgaactttt agccaaagga gaaagtccta ttggattgct taccttgctt tatcaaaatt 
12741 ttaataacgc ttgtcttgtg ctaggagccg atgagcctaa agaagccaat ctaggcatta agcagttctt 
12811 aatcaataag attgtctata actttcaata cgagctggac tcagcctttg aaggcatggc cattttaggt 
12881 caagctatcg agggcataaa gaatggtcgc tatacagaaa gttcagtggt ctatatttct ttgtataaaa 
12951 ttttttcact tacttaacaa ataagctgaa atctgtgtat attacagtat aagcaaagga ggacagccta 
13021 tgacagaagt tgcggtaaat agcccgcaaa aggtgagagt agttatggtc gggaatattg aatttctcga 
13091 atatttaaaa aggaagtacg gaacagaaac ttccatcagt tatattatag aaaatgaaag gggtctaata 
13161 tgacagactt taaaaaacgc ttcaagaaag cagtaacaga aacaatcaat cgtgacggta tcgagaacct 
13231 tatggatcgg ctcgaaaatg ataccaattt cttctcaagt ccagcaagca ctcgatacca tggaagctat 
13301 gaaggtggac ttgtcgagca ctcattaaac gtgttcaatc aactactttt cgaaatggat accatggtag 
13371 gcaaaggctg ggaagacatt tacccaatgg aaacagttgc aatcgtagca ctatttcacg acctttgcaa 
13441 agttggtcag tatcgtgaaa ctgaaaaatg gcgcaagaac agcgacggtg aatgggaaag ctatttagca 
13511 tatgaatacg accctgagca acttacaatg ggacatggtg caaaatctaa tttccttctt caacgtttca 
13581 ttcaactcac gccagttgaa gctcaagcaa ttttctggca tatgggagcc tatgatatta gtccttatgc 
13651 aaatttgaat ggatgtggag cagccttcga aactaatcca cttgcattct taatccatcg cgcagatatg 
13721 gccgcaactt atgtagtcga aaatgaaaac ttcgaatact ctcaaggtcc agttgaacaa gaggctgagg 
13791 ttgaagaagt agttgaagaa aaacctaaga gttcaactcg taagaaacct gcgcctaagg aagaaaaagt 
13861 tgaagaggct gaagaaaaac caaaagctgg aatcactcga cgtcgcaaac ctgcgccaaa agaggaagag 
13931 gtagaagagc ctaaagaaga gcctaagaaa gcatcttcta aaattcgaat gcctaaaaag actgaaaagg 
14001 tcgaagaggt agaaagcgca gacgagccga aagttgaaga agcagaggac gacaatgtgg tggtacctgc 
14071 tggatatgtt cgagatgtct actacttcta cagtgaagtc gctgacgttt actacaagaa agatgtcgac 
14141 gagcctgacg atgacagcga cattcttgta gacgaagaag agtacatgga cgcaatgtgt cctgtattag 
14211 aagaagactt cttctacgaa cttgacggca aggttcacaa attagcaaaa ggtgaacgct tgccggaaga 
14281 atacgacgaa gaaacttggg aacctatcac tgaagcagaa tacatcaagc gaacagaaaa acctaaagca 
14351 gttgcaaaac ctactcgaaa aactccagcg ccttctcgtc gccctcgccc ttaaaagaaa ggttgaaata 
14421 aaatgtgtga aaattgtcaa aacgaaacat tcaatactag aattttcaat gaagatgaaa gtggctatgt 
14491 cgacgcctca ttcacttaca aggagattcg cgacaccgca gcagctatta gcaatcgagc ggtagaaaag 
14561 aaagaccgtg acagcctttt agtcgctaca gttatggctc ttcccgtttc tcacgcagaa gatttaggca 
14631 agagactttg tattgcaaat tctcgattgg aagcatttcg tgaagctgct caagaggctc tcgagaatga 
14701 aaaggctgaa gatttaaagg acgttatctt aggtcttatc gacgttgaca aaaaaattgg caaccttgca 
14771 ttgcaattag ttgaatcagg agcattataa tggaacgaat aaagacgcta tttcacgtga tttatgctaa 
14841 cggcactcat ttagaagtag cagctttgtt cgataccgtt gatgattatg atgacgttat agaggacatc 
14911 caggggtata ttgatacccc tgacctttat aatcaaagga gcattagaat ggcgccttac aatcctgaca 
14981 ccaatggtga cgctattgct actgacattt tactacgact agatgatatt atctacgtcg acgcaacttg 
15051 tgaaactatt aaatacgagg agcctattgc atgaacaatc agcgaaagca aatgaacaaa cgaatcgtcg 
15121 aacttcgcga agactatcaa cgtgcaagag gtcgaataaa cttccttctt gctgtaaagg accacggcga 
15191 agaactcgaa aaccttgaag cctttgtggg atacattgac aatctagtcg aatgttttcc tgaaagccaa 
15261 cgaaatgtct tgaggctatg tgtattagat gaccttccag tcactaatgc ggccgctgaa attggatacc 
15331 actatacatg ggttcaccaa cttcgagaca aagcagttga aacacttgaa gaaattttag atggggataa 
15401 cattattcgc tctaaacacg gaatcgaaat taaggagaaa cttgatgaat tatatggtaa aagtcattct 
15471 agttagtgtc tttgtactgt cagccttttg catgacttgc tcaatggttt atttggttac aggtaagcaa 
15541 gaggaccacc gtagtaccgt cgcccttgta tttggcgctc tcgtaagctc tgcggcgttc tattcgacac 
15611 tctttatcct cgcctatctg ccatgacatc acgcgcatac aaaccaactc ccacgcgcag agctagtgct 
15681 aaacaagaga aggcagttgc taagcagttg ggaggaaaag tacagcctaa ttcaggagcc actgactact 
15751 acaaaggtga cgtcgtaaca gactcaatgc ttatagaatg caagacagtt atgaagccac aaagttcagt 
15821 cagcttgaaa aaggaatggt tcctaaaaaa tgaacaggaa aggttcgctc aaaaactcga ctattctgct 
15891 atcgctttcg actttggtga cggaggcgaa cagtatatag caatgtctat aagtcagttc aagcgaatat 
15961 tagaggatag aaatgataac cttatttaaa ataaacagtg aaggaacagt tactccaatt aaagggtcag 
16031 ccatgcaact gtacgcagac cctattccta tacaagagga cgatatacag ttcgttgata taactggact 
16101 tgaccctatt gttcgagaaa acgtacttga gctcatttca cggagccgtg taggagtttc aaaatatggt 
16171 acaaacctcg accagaatga tgtcgacgat ttcctacagc acgccaaaga agaagcgctc gactttgcta 
16241 actacctaac caagctacaa agtcaacaaa agcaaaataa atagacctat ttctaggtct atttttatta 
16311 ttgataaatt ccagcaattt gacgagcgca atcttctagc gcagatacta ggtggcggct ttcttgttta 
16381 ccttgttcat ttcttgcttt aattctttcg ttaaggcgtt cgattcttgt agttaatttc ttgatgattt 
16451 caattctagc atcaacttcc atgtcgcgag taagtgtgac tccagtttca gcgacaggac atgctttgaa 
16521 tactgcaatg tcaagttcgc tctttctaat aactgagcct aggtccaagt acaagttagg attgattcca 
16591 gtgaccttat attgtttctc agtttctttt acaggaatgc tttcatagtg gaaagtgtag ttcttgtgac 
16661 cgtctttcca atctgctgta agataaccga aataaagtgt tgtttccata attgacctct ttctgcgtcc 
16731 ttgacgcttg ttttatttat attatgatta tacgataata aaggaataaa gtcaagcact ttttacaaaa 
16801 aagttgaact tttttaaata tttttttttg aaaataaaaa gccctaataa tagagctttt agtttagcag 
16871 aaaattaagt tcatcttcat aagcaagaat ctgtccgtac tggtaagaaa tagctgattc aatatccggc 
16941 atttcgtgga ctcctttttt aagttcgtcg atagtacagt tacaatgacc tattcttgac tgaagttcct 
17011 caatcctttc gagtcgcttt tcattttgtg tatcaattgt tttcgagtct aggtgagtga aggaacttgc 
17081 aatagtttga atggcttcaa aaaagtccgt tattgaaact cctttataag aaagctcatt ccgtgtatag 
17151 caggaaagca aagcgttcca gctagtgatt tgaatttgag ggttaggaga gtttcgataa gctacaaaat 
17221 ttagaatatc tttgtagtca atatcagctt cagtatgatt gttgataaat accttcattt tataaccctt 
17291 ccaaatcttc gtcctcgtca tcgttttcat agcaggcgat aacttcaacc cactcgtcgt cctcaccttc 
17361 gtttcgaact cgaatgctaa ggacttccat gtcctcaaca tcttcgaatc cttcattagg tgcatatcct 
17431 tcccactcta aatcgtcgta gtcgaagata gttacaagac gtccgtcaaa ttttactgtt tcctttactg 
17501 ttgccatttt agtttcctcc ttatgcgata tatagtttga taatttgaga ttcgatgtca ccatagttga 
17571 tgaacttaac ttggtcgacc gtttcttcca tgtattcgcc catgtcttcg attcttccgt cttgaatcat 
17641 ttggccgttt tcgttgataa tttcgtacca ccattcatca ccgaattgtt tgattgcttc tttaactgtt 
17711 ttcattttac tacctccact tttccgtcca ttagtgattc gttatcatag aaccgaatac gtccatcact 
17781 aagacgttct aggcttaccc atttacgacc tcgacggtca gttactttaa attcagtacc ttttgcattt 
17851 acaactttca ttcctacttg caaatcttta acttttacca ttttatatga ctcctttatt tgtttttctt 
17921 tatagtatta ttatacgata atgagtgaat aaagtcaagt gtttttgtaa acttttttaa attttttaat 
17991 tttttttttc aaaaaaataa cgagccgaag ctacgttatt tatttatctg ctcaagggct tgttgaattg 
18061 cctcatagcc tttacgacgt gctacctttc cagccttaga gccgggtgaa aagtcccaaa cagtttcgtc 
18131 tactttaaag tcatccgcct tggcatagtc gagcaggagc tggatagctt tttgccattt ccgccaattc 
18201 ttggaaaact cacctatatt agcacaacgc aaaacaagtg ctctagtatg ctggctagac ataatgaact 
18271 ctaaaaagtt gtccaaggtt ataggaaggt cctttggaaa ctcataaggc tctttgacat cgtatttgaa 
18341 aaggctgaca atttcactgt ccttaaatag ttcaccgtct ttatacataa taccttgaac aatttcagta 
18411 ggctctgctc cgctatctag tacatcgcca accgtgtgac aataggcttt aagaactgca aaaaaacctg 
18481 gggcgtctgc acgcgcaacc tggagctcct taacagtcat ccaaggctga ggtttcttac aaacaatcct 
18551 aattccttca aaatagctct tgtccgggtc aatagtgcct aacattgtca gcctgttttt atttatataa 
18621 aggtcgaaat atacttgaat ttcatctgta ttaggcagcc acttaacagt gacttttcta taagcgattg 
18691 cttttacatt tacttttttc gagagatttg tagggataag cattctcctt ttgacattta ctttttttcg 
18761 cttttcgttc tttgccatgc tagtatctcc atttctgttg gtcttgcttt ttagctctgt tcagttcagc 
18831 tgcttctcgc gatgcaatag tttcgagaat atgcctgttc ataggctcac aatattccgc caaagatttg 
18901 ccagttatgg tggcgtcaat taagtaacca tctattgact ccttaccata aaatacaaaa tcgtcttggc 
18971 atactagcct tttataatag ccatttcctg cgcgtgtttc aattttaact aagctcattt tcacccaaac 
19041 ttgtagacga taaggagttc ctggaacttc gaacaggagc ctcctttttt catcgtctac ttgtttaata 
19111 catgagtttt gaaaatggat aactttccat ttattttcca tagtttcacc ttattccatg tacccgtcaa 
19181 caatccataa ttgaaaaggc ttaccttctc tataaggccg tgataatttt agtccagttc ccactacatt 
19251 tgaaagcgcg attaggtcat ctaggctgtc tagctcgagt tcgattacaa ggttgccagt atcaatttca 
19321 caaaagtaag cgacatttcc aactttctct agtgcttcac gatacctatc atatgtcgcc tcttcgtcaa 
19391 atagtcgcgc agaataaact tcgaatttca ttttagttac cgccttccaa aatttcatcg ggcataatct 
19461 ttgcattctc gccatgaaac cgcccttcaa tatacgcttc aagattgaag tcatgttgag gtctgtcaat 
19531 tccttccttc tttaaatttc gaaatgtgtc ctgaagcgca ttttttgttt gctcgctagg taggaccata 
19601 agtgaatatt cttccacctg ctttttaaat cgaatggcta aggctgacaa aaagcctttg aggtatgaat 
19671 tcttgtagga aggttcgcga gtaggaagtc ggtcaatacg gtaacgaaga taaagcaaag cagcctcata 
19741 tattttagac actaattcag cgtcttgttt ttcgccgaag aaaattattc gacttttatt caagcgcata 
19811 tcacgctgat taatacaaaa gcacctaaaa ttagtcgcga gaatatgacc aagttcacgt tcccaccaaa 
19881 atattcgacc tgcttctttc ccaacagctt gagaagtctc gaactgttta ggttcatcaa attgttcaac 
19951 ttgagcaagt gcgatattat tctttagcat caacttttga gccataagaa gggcagtttg cccctcttcg 
20021 tcactcgggt tgtcatttgc taattgaata agatttttaa ttttttcaat aattttttcg ttattcatat 
20091 tagtcacttt ctatcatatt ttcgagcttt cgaaaagtca atgtcgtcta cttcaattgt cttgtcataa 
20161 gtccaagcgc gacaagtgtc gaaatgaaat aggctacaaa acatcttttc attatggtcg aaactttcag 
20231 tacatttttc aatatctact tcaagttcga gaacgacaat agtatcaaca tttcgaagcg ataaaaaggc 
20301 tagagccttt tcataacttt ctgctaggta aataactcca gctgaaggct tcaatccttc agctagaatt 
20371 ttaccaagat tatcaaaatc agtggcgtga taaagtttca ttagttactt ccttacatat ctagagtcac 
20441 tacataaata gaagcagttt tatcttccaa gtcctactca atagcttcct cttcgctgag tttttcgagt 
20511 tttaaaactg tcgcttcagc tacaacatta gcaaagttcg aaccgttgag aatgttttcg atatttcctg 
20581 cgcctaagac ttcagcttgg tcattgttca ctaccattag gtattcatta gtaagtgctt tagcaaagtt 
20651 tgaaaatttc atcttatctt ccctttattt gtctttcttt atactattat tatacaataa tgattgaata 
20721 aagtaaagca ttttttataa aaaagttgaa ctttttttac aattttttga actatttaaa aattataaaa 

20791 tgggtggaaa atttaggcga caatttatac ccattttcaa cctcatttat aaacaatcta atatagaaaa 

20861 ggacttaata agtaaataaa aaagcgccct gaaaatacct acaaatccca tagtccgtaa gtaaaaacaa 

20931 aaattagggg cgacataaaa gtcgagcact atcttaatct attaccagtc tcatatacaa tcgacacaga 

21001 tttagcaggc ttttagcaaa ctttcgaaca gcatgaaaaa gcatacaatt agaggaacag attatagaaa 

21071 aagcacttcc acaaacaagt tctcaaaatg ctctcaaaaa ccgtaaaatt agtaagtttg aacttttcga 

21141 acttctaaac ttttcgaata atcgagccta atttagaggt cgaaaaactc aatttctcga aaagtcgaac 

21211 ctgctcgaaa acctcaaaac actcgaaaag tcgagcatag aaaggggtcg aaaagtcgag aatgctcgaa 

21281 aaactcaacc ggttcgaaaa cctcaatcct tcgaaaagtc gaaccattcg aaaagttcaa aagttcgaaa 

21351 aactcaacca ttcgagagta ggaattaagg acataccagt tcaacctttt tagcttcaaa atcactcttt 

21421 ttctcattat aggactataa attcagtcaa ttgtaagtca cgcgcaaatt tgttacaatg taaacgataa 

21491 aatataaagg agggtcaata aatggcgaaa gctactggac caaaagttcg aagaggaaaa actcctccac 

21561 ggccaaaaga caaaaaagga atcaaagcaa atgcgcgtgt caataaagac cagttcgtag agtatgacta 

21631 taaaggcatc aagatgacaa ttaaggaacg tgatgctaga atgaaattgg aatttattag aggcatgact 

21701 attcaggaaa ttgcagcccg ctatggatta aatgaaaagc gtgttggcga aatacgggct cgcgataaat 

21771 gggtgaaggc taagaaagag ttcgagaatg aaaaggctct tgttactaat gatacattga ctcaaatgta 

21841 tgcagggttt aaagtctcag tcaatattaa atatcacgcc gcctgggaga aactaatgaa catcgtcgaa 

21911 atgtgtttag ataatcctga cagatattta tttactaaag aaggaaatat tagatggggc gcattagatg 

21981 tcctttcgaa ccttatagat agagctcaaa aaggacaaga aagagcgaat ggaatgcttc cggaagaggt 

22051 tcgatataga ctacaaattg agcgcgagaa aattacattg ctccgggcca aaatgggcga ccaggaaatt 

22121 gaaggcgagg ttaaagataa cttcgtagaa gcactagata aagcagctca agccgtttgg caagaattta 

22191 gtgacgcaac aggttcctac attaaaggag tgactgataa tgacaataag cctgagaaat aaactaccta 

22261 agttcaactt cgtccctttt agtaagaaac aactccagct cctaacatgg tggacaaagg gctcaccttt 

22331 tcgaactttc gatatcgtca tagcagacgg ttccattcgt tcaggaaaaa cagtatcgat ggctctttca 

22401 ttttcccttt gggccatgac ggaattcaac ggacaaaact ttgccatctg tggtaagaca attcactcag 

22471 ctcgacgaaa tgttattcag cctctaaagc aaatgctcac aagtcgcggg tatgaaattc gagatgttcg 

22541 aaatgaaaat ctacttatta ttagacactt tagaaatggc gaagaaattg tcaactactt ctatatattt 

22611 ggaggaaaag atgagtcgag tcaagacctt atacaggggg taacattagc aggtatcttc tgtgatgagg 

22681 tggcactgat gcctgaatcg tttgtcaacc aagcgacagg gcgctgttcc gtaacaggtt cgaaaatgtg 

22751 gttctcttgt aacccggcca atcctaatca ctacttcaag aagaactgga ttgacaaaca ggtcgaaaag 

22821 cgtatcttat atcttcactt tacaatggac gacaacccta gcttgacgga tagcattaaa aggcgctatg 

22891 agaaaatgta tgctggagtc ttcaggaaaa gatttattct cggcctttgg gtaacagcag atggtctagt 

22961 ttattcaatg ttcaatgaag agcagcatgt caaaaagctc aatatagaat tcgaccgttt attcgtagca 

23031 ggcgactttg gtatctataa tgcaacaacc ttcggccttt atggattctc gaaacgtcat aagcgctacc 

23101 atctaattga gtcatactac cactcagggc gcgaggcgga agagcaacta actgaggcgg atgttaattc 
23171 gaatattcaa tttagttcag ttctacaaaa gactactaaa gagtacgcaa atgatttagt cgatatgata 

23241 cgaggaaagc aaatcgaata tataattctc gacccgtctg cttctgctat gattgttgaa cttcaaaagc 

23311 atccttatat agctagaaag aatatcccta tcattcctgc tcgaaatgac gtgacgcttg gcatttcatt 

23381 tcacgctgaa ctcttggctg agaatagatt tacactcgac cctagcaaca cgcacgacat tgatgaatac 

23451 tatgcttaca gctgggacag taaagcgagc caaacgggag aagatagagt cattaaagag catgaccact 

23521 gcatggatag gaacagatat gcctgtctca ctgacgctct aatcaacgat gacttcggtt tcgaaataca 

23591 aatattatcc ggaaaaggcg ctagaaacta actaaacact tctatagaaa ttagtgtata atataagtag 

23661 gaggatttta aacatggcta aaaaatcaaa agctatctca cacacagacg aactgattag tcagtcgttt 

23731 gacagcccct tggcaaagaa tcaaaagttc aagaaagagc ttcaggaagt tgaaaagtat tatcaatact 

23801 tcgacggatt tgatgtcacg gacttgaata ctgactatgg gcaaacatgg aagattgacg aagactcagt 
23871 cgactataaa cctactcgag aaattcgaaa ctatattcga caacttatca aaaagcaatc acgctttatg 
23941 atgggtaaag agccagagct tatctttagt ccagctcaag acaatcaaga tgaacaggct gagaacaagc 
24011 gtattctatt cgactctatt ttaaggaatt gtaaattctg gagcaaaagt acaaatgcat tagtcgacgc 
24081 cacagtaggt aagcgggtat tgatgacagt agtagcaaat gccgctcaac aaattgacgt ccagttttat 
24151 tcaatgcctc agttcaccta tacagttgac cctagaaacc cttccagctt gctttctgtt gacattgttt 
24221 atcaggacga gcgtacaaaa ggaatgagca ctgaaaaaca actttggcat cattatagat atgaaatgaa 
24291 agctggaaca agtcaatcag gaattgcaac agctttagaa gacattgaag aacaatgttg gctcacttat 
24361 gccttaacgg atggagagtc gaaccaaatc tatatgacag aaagtggcca aactactatc aaggagacag 
24431 aggctaaact tgtagaaatt gaagacaacc taggaaacaa gattgaagtt cctttaaaag ttcaagaatc 

24501 cgccccaacc ggcttgaagc aaattccttg tcgagttatt cttaatgaac cattgactaa tgacatatac 
24571 gggacaagcg atgtcaaaga ccttatcaca gtagcagata acttgaacaa aactattagt gacttacgag 
24641 attcacttcg atttaaaatg ttcgagcagc ctgttatcat tgatggctct tctaagtcaa ttcaaggaat 
24711 gaagattgcg ccaaacgctt tggtcgacct taagagtgac cctacctcct caatcggcgg tactggaggc 
24781 aagcaagctc aagtcacttc catttcagga aacttcaact tccttccagc ggctgaatat tatttagagg 
24851 gcgctaagaa agccatgtat gaactaatgg accagccaat gcctgaaaag gtacaggagg cgccatcagg 
24921 aattgcaatg cagttcttat tctacgacct aatttctcga tgtgacggaa aatggattga gtgggatgat 
24991 gctattcaat ggctcattca aatgctggaa gaaattttag caacagtgaa tgttgacttg ggaaatattc 
25061 ctcaagatat tcaatcaagt tatcaaacac ttacgacaat gactatcgaa caccactatc caattcctag 
25131 cgatgaactt tctgctaagc aacttgcgct cactgaagtt caaactaatg tacgcagcca ccaatcttac 
25201 attgaagaat tcagtaagaa ggaaaaggcg gacaaggaat gggaacgcat tttggaagaa cttgctcagc 
25271 ttgacgaaat ctcagctgga gcattgcctg tattagcaaa cgaattaaac gaacaagagg agcctcaaga 
25341 tgaaacgagt gaagaagacg aagttgatga caaagaaaaa gaacaaactg aacaaccaac cgaagaagga 

25411 gtcgacccag acgttcaagg ttaattgtga ccattgtgag cataagttcg accttacatc taaacagatt 

25481 atttcgaaac atatcgaaaa gggcgtagag tggagattct tcgaatgtcc taagtgccat tatcggttca 

25551 ccacttatgt aggaaacaag gaaattgaaa accttattcg atttagaaat acttgtcgag ctaaaatgaa 

25621 gcaggaactt caaaaaggag ctgctgctaa tcaaaacact taccattcat atcgaattca ggatgagcaa 

25691 gctgggcata aaatctcagg gcttatggcg aagctaaaga aggagataaa cattgaaaaa cgagaaaaag 

25761 aatgggtatc tatatagctg ggaaaaggct attcatgaaa ataatattcg tctaaccctt gaacaggaac 

25831 aagctgtact gaaagccttc agcgatgcag gaactgattt aattgcaaag attaaaaagt ctcgaaatgg 

25901 atacttgcct aaaagaatct ataaagacta cgcttacgac ctgcacgctg ttcttgttca actaatgact 

25971 gaatactctc ataaggcggc aatgaacgca gtagatggcc aggtagttca tattctacaa gtattagcag 
26041 aagatggaaa tgctacggct gaaaagttcg aaaaggaagt cagggctgca tctttagtat tttcacgaag 

26111 agcagccgag gcagttgtca aaggtgaaat ctataaggac ggcaaaaacc tctcgaaacg tgtttggtct 

26181 tcagccgcac gcgcaggaaa tgatgttcaa caaatagtca cacaaggcct agcaagtgga atgtctgcta 

26251 cagatatggc taaaatgctc gagaaatata tcgaccctaa ggttcgaaaa gattgggact ttgataagat 

26321 agctgagaag ctagggaaac ctgctgctca taaatatcaa aatctcgaat acaatgccct tcgacttgct 

26391 cgaactacca ttagccattc cgccacagct ggagtgagac aatggggcaa ggttaatcct tatgctcgaa 

26461 aagttcaatg gcattctgtt cacgctccag gtcgaacgtg tcaagcgtgt atcgatttag atggtgaagt 

26531 atttcctatc gaagaatgtc ctttcgacca tcctaatgga atgtgctacc aaactgtatg gtacgaaaac 

26601 tcactcgaag aaatcgctga tgagttgaga ggctgggtag acggagaacc taatgatgta ttagacgaat 

26671 ggtacgacga tttaagttca ggaaaagttg agaaatacag cgacctcgac tttgttaaaa gttattaggc 

26741 tcggttcaat accgagtctt tttgtctata aattgtctaa tttcgagaac cttcgaaaag tagtaaaatg 

26811 atattcagtt atgttataat ataagttgaa aaggaacctt gtcgccttaa tgactcgaaa ttggtttcac 

26881 tgttccaatt aaataaaaac agcagattca gccggagggc ggaaaactca ggaggaaaat aaatggctta 

26951 tcaattagaa gacttgttaa aaggtctaga tgaaccaact atcaaacagg tgaaggaaat tatttcgaaa 

27021 acttcgaaag aactcgatgc taaaattttc attgacggcg acggtcaaca ttttgtacct cacgcacgtt 

27091 tcgatgaagt tgttcaacag cgcgatgcag ctaacggctc aattaattct tataaagaac aagtcgcgac 

27161 gctttctaaa caggtcaaag ataacggtga tgcgcagacc actatccaaa accttcaaga gcaactcgac 

27231 aagcagtctc aacttgcaaa aggcgctgtg attacttcag ctcttcatcc gttgattagt gactccattg 

27301 ctccagcagc agacattctt ggatttatga accttgacaa cattacggtc gaaagtgacg gtaaagttaa 

27371 aggtcttgat gaagagttga aagctgttcg tgagtctcgt aaatacttat tcaaagaagt cgaagttccc 

27441 gcagaacaag aggctcaagc taagtcgcca gccgggactg gaaatttagg aaatccaggt cgtgtcggtg 

27511 gtggtgttcc cgaacctcgt gaaatcggct cttttggtaa gcaacttgct gctgctcaac aaacggcagg 

27581 agcacaagaa caatcatcat tctttaaata ataggaggaa ctaactatgc ctaatgtgcg agttaagaaa 

27651 actgatttta atcaaaccac tcgaagcatt gtcgcaattc ctgaccacta cgttgctttg gctgctcaaa 

27721 ttccagctac cgcagcaact caagtaggga acaagaaata cattcttgcc ggaacttgcg tgaaaaatgc 

27791 tactacattt gaaggacgca aaactggact cgaagtagta tctaccggtg aacaattcga cggagttatc 

27861 ttcgctgacc aagaagtgtt tgaaggtgaa gaaaaagtaa ccgtgacagt attagttcac ggattcgtca 

27931 aatatgcagc cctccgaaaa gttggcgatg ctgtgcctga atctaaaaac gcaatgactc ttgtcgttaa 

28001 ataggaggaa ttatagatga atatttatga ttatatcaac gcaggggaga ttgctagcta cattcaagca 

28071 cttccttcaa acgctcttca ataccttgga ccaactcttt tccctaatgc tcaacaaaca gggacagaca 

28141 tttcatggct caagggtgca aataatttgc cagtaactat ccagccatct aactacgacg cgaaagcaag 

28211 tcttcgtgaa cgtgctggat ttagcaaaca agctactgag atggcattct tccgtgagtc tatgcgactt 

28281 ggtgaaaaag accgtcaaaa cttgcaaatg ctattgaacc aaagttcagc tcttgcccaa ccacttatca 

28351 ctcaactcta taatgatact aagaaccttg tagacggtgt tgaagcgcaa gcagaataca tgcgtatgca 

28421 attgcttcaa tacggtaaat tcactgtcaa atcaactaac agcgaggctc aatacactta cgactacaac 

28491 atggatgcta agcaacaata tgcagtcact aagaaatgga ctaacccagc tgaaagtgac cctatcgctg 

28561 acattttagc agcaatggat gacatcgaaa atcgtacagg tgttcgccct actcgaatgg tcttgaaccg 

28631 aaacacttat aaccaaatga ctaagagtga ctctatcaag aaagctcttg caattggtgt tcaaggttct 

28701 tgggaaaact tcttgcttct tgcaagtgac gctgagaaat tcatcgctga aaaaacaggt cttcaaatcg 

28771 ctgtctactc taagaaaatt gctcagttcg ctgacgctga caaacttcct gacgttggta acattcgtca 

28841 gttcaacttg attgacgacg gtaaagtggt attgcttcca cctgacgcag ttggtcacac ttggtacggt 

28911 actactccag aagcattcga cttggcttca ggcggaacag acgctcaagt tcaagttctt tcaggcggac

28981 ctaccgttac aacttatctt gaaaaacatc ctgtcaacat tgcaacagtt gtatcagctg ttatgattcc 

29051 atcattcgaa ggaattgact atgtaggagt tctcacaact aattaggagg tcgctatatg gctacattga 

29121 aagctcttag caccttaatc gtttccggag cagtagtgca ttcagggtcg gtattttctt gccctgaagc 

29191 gcttgcttcg tctttaattg aacgcaattt tgcgttcgag attaaggcgg ctgaagatgg agaaacggta

29261 gaaactgttc ctcaaacaat tgaatcagtt gaagaaattg acgaagttga acaaatgcgc gaagagtatg 

29331 cggctaaaac cgttcctgag ctcgttgaat tagcaagagc taatggaatt gacatttctt caatttctcg 

29401 aaaaagcgaa tatatcgacg ctttaattaa gtacgaacta ggagagtaaa atggcagctc aaacggacat

29471 tgaattagtc aaaatcaata tcgataacga taattctccg tcaccaatga ctgaccaaag tatctcagct

29541 cttttagaca agcataaatc tgtcgcctat gttagttata tgatttgctt aatgaagacc cggaatgacg 

29611 tggtaaccct tggacctatc agtctaaaag gtgacgcaga ctactggaaa caaatggcgc aattctatta

29681 tgaccaatat aagcaagaac agcttgaaac tgatgaaaag ccgaacgctg gttcgacaat cttaatgaaa

29751 agggctgatg ggacatgagt tatgacgtga attatgttaa gaatcaagtt cgtagagcca ttgaaaccgc

29821 tcctactaaa atcaaggtac ttcgaaactc ttgggtcagt gatggatatg gaggaaagaa aaaggataaa

29891 gcgaatgaag tcgtagcaga cgaccttgtt tgtttagttg ataattcaac tgttcctgac cttttagcca 

29961 attctactga cgcgggaaaa atttttgccc aaaatggagt gaaaattttc attctatatg atgaaggcaa
30031 aatcattcaa cgagccgata ctatcgaaat taaaaactca ggaagacggt acagggtagt agaaacccac 
30101 aatcttctcg agcaagacat tttgatagaa cttaaattgg aggtgaacga ctaatgtctc agcctgaatt 
30171 agtatggaag cctgaagaat ttgttagtaa ctgtgaacgg tatcgaaaca agtttcaagt cgctgtcata 

30241 acagtctgcg aagtcgctgc tactaagatg gaagaatacg caaagacgca tgctatttgg acagaccgta
30311 cagggaatgc tcgacagaaa ctcaaaggag aagctgcttg ggtaagcgca gaccaaatca tgatagctgt 
30381 atcacatcac atggactacg ggttttggct agaactagct catggtcgaa aatacaaaat tctcgaacag 
30451 gctgtagaag acaatgtcga agaacttttt agagcgttga gaaggttatt agactaggag tgaacatgac 
30521 taaacgaacg acaatgatgg acagattgaa ggaaattctt cctacatttc agctctcgcc tgctcctatg 
30591 cttccaggag ttgaatttga cgagcaagat acagataggc cggatgacta cattgttctt cgatatagtc 
30661 atagaatgcc cagcgcaaca aacagcctag gaagttttgc ttattggaaa gttcaaatct acgtccattc 
30731 aaactcaatt attggtatcg acgaatatag cagaaaggtt cgaaacatta tcaaggacat gggctacgaa 
30801 gtaacctatg cagaaactgg tgactactcc gacacaatgc tttctagata ccgactagaa atcgaatata
30871 gaattccaca aggaggaaac taataatgag taaagacatt ctttacggaa ccaagctcgt gcaaatcgag 
30941 gagcttgacc cattgactca gttgccaaaa gtcggcggag ctaactttgt cgtagatacg gcagaaacag 
31011 cagaactcga agccgtgacc tcggagggaa ccgaagatgt gaaacgcaat gacacgcgca ttcttgctat 
31081 cgtgcgtact ccagaccttt tatacggtta tgacttaaca ttcaaggaca acacgtttga ccctgaaatc 
31151 atggccctaa ttgaaggtgg tacagtacgt caacaaggcg gaactattgc tggatacgac accccaatgc 
31221 ttgcacaagg tgcttctaat atgaaaccat ttagaatgaa catctatgtg ccaaactatg taggtgactc 
31291 aattgtcaac tacgtgaaaa tcactttgaa taactgtacc ggtaaagctc cagggctttc aatcgggaaa 
31361 gagttctacg ctcctgagtt caacatcaag gcacgtgaag caaccaaagc aggtttgcca gttaagtcaa

31431 tggactatgt ggcacaactt ccagcggttc ttcgtcgcgt gacattcgat ttgaacggtg gaacaggaac 

31501 cgccgacgca gttcgagttg aagcaggtaa gaagatttct ccaaaaccag ttgaccctac cttaacaggt

31571 aaggctttca aaggctggaa agttgaagga gaatcaacta tttgggactt cgacaaccac atgatgcctg 

31641 accgagacgt caaactcgta gcacaatttg catagaaatt tagaaagaag ggtctgttat gactaatatt 

31711 atcacagctg agcagtttaa gcaacttgca tttcaaatca tcgcacttcc aggattttca aaaggtagtg 

31781 aacctatcca tgttaaaatt cgagcagcag gtgtcatgaa cctaatcgct aacgggaaaa tccctaatac 

31851 gcttttaggt aaagtgacag aactgtttgg agaaacttcg acagtcacta aagacaatgc tagtctagca 

31921 tcaattactg accaacagaa gaaagaagcg ctcgaccgat tgaacaaaac cgataccggt attcaagaca 

31991 tggctgaacc tcttcgagta ttcgcagaag cttcaatggt agagcctact tacgctgaag tcggcgagta

32061 tatgacagat gagcaactta tgacaatctt cagtgcaatg tacggtgaag tgactcaagc tgaaaccttt 

32131 cgtacagacg aaggaaatgt ctaatgtcat agcagtcgct actgaatttc atattagacc tagcgaggtg

32201 gtcgggatgc aaactgattt aggcaaatac tgcttcgacg cagcagccgt tgcttatatt agatatttgc 

32271 aggaagacaa gactcctagg tatcctggtg acgaaaagaa aaatccagga ttgcaaatgc ttatggagtg 

32341 actattttca gtcgctcctc tttttgtata tagaaaggaa attacatgga ttttgggtca attgcagcaa 

32411 aaatgacttt ggatatctca aacttcacaa gtcaattaaa tcttgctcaa agtcaagcgc aacggctcgc 

32481 actagagcct tcgaagtcct ttcaaattgg ttctgcttta acaggattag ggaaaggact tacgactgcg 

32551 gttacccttc ctcttatggg acttgcagcc gcctctatta aagtagggaa tgaattccaa gctcaaatgt

32621 cccgtgttca agctattgca ggagcgacag cggaagagct tggtagaatg aagactcaag caatcgacct 

32691 tggtgctaaa actgctttta gtgcaaaaga ggcggctcaa ggtatggaaa atctagcttc agccggtttc 

32761 caggtaaatg aaatcatgga cgctatgcca ggggtacttg acctggctgc cgtatctgga ggagatgtgg 

32831 ccgcgagctc cgaggccatg gctagttcac ttcgagcctt tggattagag gcaaaccagg cgggtcacgt 

32901 ggctgacgta tttgctcgag cagcagctga tacgaacgca gaaactagcg acatggcaga ggcgatgaaa 

32971 cacgtcgcac ccgttgctca ctctatgggc ttgagccttg aagaaacggc tgcgtctatt gggattatgg 

33041 ccgacgccgg tattaagggc tcgcaagccg gaaccacgct tagaggcgct ctctcgcgta ttgccaaacc

33111 tacgaaagcg atggtcaaat caatgcagga attaggagtt tcgttctacg acgcgaacgg aaacatgatt 

33181 ccactaagag aacaaatcgc tcaactgaaa acagctactg caggactaac acaagaggaa cgaaatcgtc 

33251 accttgttac cttgtatggc caaaactcgt tgtcaggtat gcttgcacta ttagacgcag gtcctgagaa 

33321 attggataag atgaccaatg ctctcgtgaa ctcggacgga gctgctaagg aaatggcaga aactatgcag 

33391 gacaaccttg ctagtaaaat cgagcaaatg ggaggagctt tcgagtctgt tgctattatt gttcaacaaa 

33461 tccttgagcc tgcacttgct aaaatcgtgg gagcaatcac aaaagttctc gaagcattcg taaatatgtc 

33531 acctatcggt caaaagatgg ttgtcatatt cgcaggaatg gttgcagccc ttggaccact gcttctaatt 

33601 gcaggaatgg tgatgacaac tattgtcaag ttaagaattg ctattcagtt tttaggtcca gcatttatgg 

33671 gaacgatggg aaccattgca ggagttatag caatattcta tgctctggtc gccgtgttca tgatagccta 

33741 cacaaaatcg gagagattta gaaactttat caacagtctt gcgcctgcta ttaaagctgg gtttggagga 

33811 gcgttggaat ggctacttcc acgactgaaa gagttaggag aatggttaca gaaggcaggc gagaaggcga

33881 aagagttcgg tcagtctgta gggtctaaag tgtcaaaact gctcgaacag tttggaataa gtatcggtca 

33951 ggcaggaggc tcgattggtc agttcattgg aaatgttctc gaaaggctag gaggcgcatt tggaaaagta
34021 ggaggagtca tttcaattgc tgtttcactt gtaacaaaat tcggtctcgc atttctaggg attacaggac 
34091 cactcgggat tgctattagt ctgttagttc catttttgac agcttgggct agaacaggtg agttcaacgc 
34161 agacggaatt actcaagtat tcgaaaactt gacaaacaca attcagtcga cggctgattt catctctcaa 
34231 taccttccag tctttgtcga aaaaggaact caaattttag ttaagattat tgaaggaatt gcatctgctg 
34301 ttcctcaagt agttgaagtg atttcacaag tcattgaaaa tattgtgatg acaatttcga cagttatgcc 
34371 tcaattagtc gaagcaggaa ttaagatact cgaagcgctt ataaatggtc ttgttcaatc tcttcctact 
34441 atcattcaag cagctgttca aattatcact gctttattca atggtcttgt tcaggcactt cctacgctta 
34511 ttcaagcagg tcttcaaatt ttgtcagctc tcataaacgg actagttcaa gcgcttccgg caattattca 
34581 agcagctgtt caaattatca tgtcgcttgt tcaagcacta attgaaaact tgcctatgat aatcgaagca 
34651 gcgatgcaga ttataatggg tctagtcaac gcactgattg aaaatatagg acctatctta gaagcaggga 
34721 ttcaaattct aatggcttta atcgagggac ttattcaagt gcttcctgaa ctaattacag cagcgattca
34791 aatcattact tcactattag aagcaatctt gtcgaacctt cctcaacttc tagaagccgg agttaaattg 

34861 cttttatcac ttcttcaagg gttgctaaat atgcttcctc aactaattgc aggggctttg caaatcatga
34931 tggcacttct taaagcagtt atcgacttcg tccctaaact tcttcaagca ggtgttcaac ttcttaaggc 
35001 attgattcaa ggtattgctt cacttctcgg ctcactttta tcgacagctg gaaacatgct ttcatcatta 
35071 gttagcaaga ttgctagctt tgtgggacag atggtttcag gaggtgcgaa cctgattcga aacttcatta 
35141 gtggtattgg gtcaatgatt ggttcagctg tctctaaaat tggcagcatg ggaacttcaa ttgtttctaa 
35211 ggttactgga ttcgctggac aaatggtaag cgcaggggtc aaccttgttc gaggatttat caatggtatc 
35281 agttccatgg taagttctgc ggtaagtgcg gcggctaata tggctagcag tgcattaaat gccgttaagg 
35351 gattcttagg tattcactct ccttcacgtg tcatggagca gatgggtatc tatacgggtc aagggttcgt 
35421 aaatggtatt ggtaacatga ttcgaactac acgtgacaag gctaaagaaa tggctgaaac tgttactgaa 
35491 gctctcagcg acgtgaagat ggatattcaa gaaaatggag ttatagaaaa ggttaaatca gtttacgaaa 
35561 agatggctga ccaacttcct gaaactcttc cagctcctga tttcgaagat gttcgtaaag cagccggttc 
35631 gcctcgagtg gacttgttca atacaggaag tgacaaccct aaccaacctc agtcacaatc taaaaacaat 
35701 caaggcgagc aaaccgttgt caacattgga acaatcgtag ttcgaaacaa tgacgacgtt gacaaactgt 
35771 cgagaggatc gtataataga agtaaagaaa ctctatcagg gtttggtaac attgtaacac cgtaaaggag 
35841 aaatagacgg ctagcagaca gacgctattg gtcgacggaa ttgaccttgt cgacaaaggt gcaaccgtgc 
35911 tagaatatgt aggactcact ttcgcaggat ttaaggactc aggatttaaa aaccctgaag gcatagacgg 
35981 agtattagat tctccgtcta atgctatgtc cgctcttact ggaagcgtga ccttaatgtt ccacggagaa 
36051 accgaaaagc aagttaatca aaaatacagg cagttcaaac aatttattcg ctcgaagtca ttttggagaa 
36121 tttcgacact tgaagaccct ggatactatc gaacgggaaa atttttagga gaaaccgagc aaggaaaact 
36191 tgtagacgtt caagccttta aagatacttc ccttgtagtt aaattaggga ttcagttcaa agatgcttac 
36261 gagtacagcg actcaactgt tcgaaaggtt tataagtttc aacccgcttt gggaggcgat agcttaccta 
36331 acccaggaag acctactcga caatttagag tagaaataag aactacttct caaatcaaag gatattttcg 
36401 aattggcgaa aaaagtccag gacagtttgt tgagttcggt actaattcag tattgatgga aagtggctcg 
36471 attattattc taaatcttgg aacttttgaa cttattaaaa ttagcagtgc aaatcaagcg actaacttat 
36541 ttagatacat taaacgaggc gcattcttca agattcctaa tggaaattca acaattacca ttgaataccg 
36611 agccgatgac gcagcagctt ggacctctac tcttcccgct caagttgaac tgtttctaaa tccgtcttac 
36681 tattagaaag ggaatatatg attgacaata atttacctat gagtccaatt cctggcgaaa ttgttcaagt
36751 atatgaccaa aacttcaatc taattggagc aagtgatgaa atctttagca agcattacga agacgaaatt
36821 gtgactcgag ctcgaggaaa agaaactttc acttttgaaa gtattgaaac ctcatctatc tatcaacact
36891 taaaggttga aaacattatc cagtatggag gaagatggtc tcgaattaaa tatgctcagg acgtagaaga
36961 tgtcaaaggg cttaccaagt ttacctgcta cgcattatgg tatgaactag cagaaggctt gcctaggaag
37031 ttgaaacacg ttgcttcttc tgtaggcgct gtcgcgctag atattatcaa agacgcaggt gaatgggttc
37101 gactagtttg tcctcctgac ggtgctaaca aacaagttcg aagcataaca gccgcagaaa attcaatgct
37171 ttggcatctt cgatatcttg caaagcaata caatttagaa ttgacatttg gttatgaaga aattatcaag
37241 caagaggtta gaattgttca aaccgttgta tttcttcagc cttatgtcga gtctaaagta gactttcctc
37311 ttgtagttga agagaatttg aaatatgtca ctaggcagga agattctcga aacctgtgta cggcttacaa
37381 gttgacaggt aaaaaggaag aaggcagtca agagccttta acgtttgctt ctatcaacaa cggaagtgaa
37451 tatctcattg atgtttcgtg gtttactaca cgccacatga agcctcgata tattgctaaa tctaaaagcg
37521 acgaacattt tagaattaaa gaaaatttga tgagtgctgc gcgtgcttat cttgacatct acagtcgccc
37591 actaattgga tatgaggctt cagcggtcct ttataacaag gttcctgact tgcatcatac tcaactaatt
37661 gtcgacgacc attatgatgt tatcgagtgg cgaaagatat ctgctcgaaa aattgactac gacgaccttt
37731 caaactctac tatcattttc caagaccctc gaaaagactt gatggacttg ctaaatgagg acggcgaagg
37801 agtcctttca ggggaaactg taaatgagtc ccaagttgtt attagatacg cagatgacat tttagggact
37871 aatcttaatg cagaatctgg gaaatacatt ggtgtcctta atactaataa gaaaccgagc gaattagttc
37941 ctgacgactt tacatggatt cgactagaag gtcctaaagg tgacgcaggt ttaccgggag ctcctgggcg
38011 tgatggagtc gacggtgtac ctggaaagag cggagtaggg atagcagata cagctatcac ttatgctgta
38081 tccgtttccg gaacgcaaga gcctgaaaat ggatggagcg aacaagttcc tgaactcata aaaggtcgat
38151 tctcgtggac taaaacactt tggagatata ctgacggctc acatgaaact ggatactccg ttgcctatat
38221 agggcaagac ggaaattccg gaaaagacgg aatcgcaggt aaggacggag taggtatagc cgcaactgaa
38291 gtcatgtatg caagttcgcc atctgctact gaagctccag ctggtggatg gtctacgcaa gttcctaccg
38361 tcccaggtgg tcagtattta tggactcgaa caagatggcg ctacactgac caaactgatg aaattggata
38431 ttcagtttca agaatgggcg agcagggtcc taaaggtgac gcaggtcgtg acggtattgc aggaaagaac
38501 ggaatagggt tgaagtcaac ttcagtttct tatggaatta gtcccactga ttctgcgatt cctggagtat
38571 gggcttcaca agttccttct ttaatcaaag gtcaatatct ttggactcga actatttgga cctataccga
38641 ttcaactacc gaaacgggct atcaaaaaac ctacattcca aaagacggga atgacggtaa aaatggaatt
38711 gctggtaagg atggggtagg aattaagtct acgaccatta cctacgcagg ctcaacctca ggaacagttg
38781 cgcctacctc aaattggact tctgctattc caaatgttca accgggattc ttcttgtgga cgaaaactgt
38851 ttggaactat actgatgaca ctagcgaaac aggttactca gtttccaaga taggtgaaac aggtcctaga
38921 ggagttcaag gtcttcaagg tcctcaaggg cttcaaggaa ttcctggacc tgcaggagct gacggacgtt
38991 cgcaatatac tcacctcgct ttctctaata gcccaaacgg tgagggattt agtcatactg acagcggacg
39061 agcatacgtc ggtcagtatc aagatttcaa tcccgtccat tcaaaagacc ctgcagccta tacatggacg
39131 aaatggaagg ggaatgacgg agctcaaggg atacccggga agccaggcgc agacggtaag actaattatt
39201 tccatatagc ttacgcttca agtgcagacg gatcacgtga gttcagtttg gaagataata atcaacaata
39271 tatgggttat tactccgatt atgagcaagc agatagcagg gatcgaacta agtatcgatg gtttgaccgc
39341 cttgccaatg ttcaagtggg aggtcgaaac gagttcctta attctttatt tgaatttggt ttaaaacctc
39411 gctattctag ttacaatcta atggacggac aagatcaaac gcaaggacag atatctgcta ctattgacga
39481 acgtcaacgg ttcaaaggtg ctaactcttt acgacttgac tcaacatgga acggtaaacc gcagaaccaa
39551 aaactgacct tttctttagg aggagatacg cgattaggta ctccaaccga gtggtctaac ttagaaggtc
39621 gtatcagttt ctgggctaag gcctctagga acggagtgag cttagctgca cggccgggtt atcgtagtaa
39691 cgtatttacc gcaaccttaa ccgatcaatg gaagttctac gattttaaat tctttgacaa agttaattca
39761 aattgtaccg ctgaagcaat tttccatgta ttcactcaaa gttgttcagt gtggctcaat catattaaaa
39831 tcgaacttgg taatatctct actcctttta gtgaagcaga ggaagacctt aaatatcgaa ttgactcaaa
39901 agccgatcaa aagctaacta accaacagtt gacggcactc acggaaaagg ctcaactaca tgacgcagaa
39971 ctgaaagcta aggctacaat ggagcagtta agtaacttag aaaaggctta tgaaggtaga atgaaagcta
40041 atgaagaagc tatcaaaaaa tcggaagccg acctaatctt agcggcaagt cgaattgaag ctactatcca
40111 agaacttggc gggctacggg aactgaagaa gttcgtcgac agttacatga gctcttctaa tgaaggtcta
40181 attatcggta agaacgacgg tagctctacc attaaggtat caagtgaccg aatttctatg ttctccgcag
40251 ggaatgaagt tatgtacctt acgcaagggt tcattcacat cgataacggg atctttaccc aatccattca
40321 agtcggccga tttagaacgg aacaatactc gtttaatcca gacatgaacg tgattcggta tgtaggataa
40391 ggagaataac atgacaaaat ttatcaactc atacggccct cttcacttga acctttacgt cgaacaagtt
40461 agtcaggacg taacgaacaa ctcctcgcga gttagttggc gagctactgt cgaccgcgat ggagcttatc
40531 gaacgtggac ttatggaaat attagtaacc tttccgtatg gttaaatggt tcaagtgttc atagcagtca
40601 cccagactac gacacgtccg gcgaagaggt aacgctcgca agtggagaag tgactgttcc tcacaatagt
40671 gacgggacaa agacaatgtc cgtttgggct tcgtttgacc ctaataacgg cgttcacgga aatatcacta
40741 tctctactaa ttacacttta gacagtattc caaggtctac acagatttct agttctgagg gaaatcgaaa
40811 tctaggatct tcacatacgg ttatctttaa ccgaaaagtg aactctttta cgcatcaagt ttggtaccga
40881 gttttcggta gcgactggat agatttaggt aagaaccata ctactagcgt atcctttacg ccgtcactgg
40951 acttagcaag gtacttacct aaatcaagtt ccggaacaat ggacatctgt attrcgaacct ataacggaac
41021 tacgcaaatt ggtagtgacg tctattcaaa cggatggagg ttcaacatcc ccgattcagt acgtcctact
41091 ttttcgggca tttctttagt agacacgact tcagcggttc gacagatttt aacagggaac aacttcctcc
41161 aaatcatgtc gaacattcaa gtcaacttca acaatgcttc cggcgcttac ggatccacta tccaagcatt
41231 tcacgctgag ctcgtaggta aaaaccaagc tatcaacgaa aacggcggca aattgggtat gatgaacttt
41301 aatggctccg ctaccgtaag agcatgggtt acagacacgc gaggaaaaca atcgaacgtc caagacgtat
41371 ctatcaatgt tatagaatac tatggaccgt ctatcaattt ctccgttcaa cgtactcgtc aaaatcctgc
41441 aattatccaa gctcttcgaa atgctaaggt cgcacctata acggtaggag gtcaacagaa aaacatcatg
41511 caaattacct tctccgtggc gccgttgaac actactaatt tcacagaaga tagaggttcg gcgtcaggga
41581 cgttcactac tatttcccta atgactaact cgtccgcgaa cttagctggt aactacgggc cggacaagtc
41651 ttacatagtt aaggctaaaa tccaagacag gttcacttcg actgaattta gtgctacggt agctaccgaa
41721 tcagtagttc ttaactatga caaggacggt cgacttggag ttggtaaggt tgtagaacaa gggaaggcag
41791 ggccaattga tgcagcaggt gatatacatg ctggaggtcg acaagttcaa cagtttcagc tcactgataa
41861 taatggagca ttgaacaggg gtcaatataa cgatgttcgg aataagcgtg aaacagagtt tacatggcga
41931 agtaacaaat acgaggacaa ccctacggga actcgaggtg aatggggact atttcaaaat ttctggttag
42001 atagctggaa aatggttcaa tccttcatta caatgtcagg aagaatgttc atcaggacag cgaacgatgg
42071 aaacagctgg agacctaaca agtggaaaga ggttctattt aagcaagact tcgaacagaa taattggcag
42141 aaacttgttc ttcaaagtgg gtggaaccat cactcaaccc atggcgacgc attctattcg aaaactcttg
42211 acggcatagt atatttgaga ggaaatgtgc ataaaggact tatcgacaaa gaggctacta ttgcagtact
42281 tcctgaagga tttagaccga aagtttcaat gtatcttcag gctctcaata actcatatgg aaatgccatt
42351 ctatgtatat acactgacgg aagacttgtg gtgaaatcga atgtagataa ttcttggtta aatttagaca
42421 atgtctcatt tcgtatttaa tttgagctga aatcatgtta taatattttt tagaaaggag gtgagaacta
42491 tgttgaacct tacaaaatcg cgccaaattg tggcagagtt cactattgga caaggagctg aaaagaaact
42561 tgtcaaaaca acgattgtga acattgatgc aaacgcagta tcaaccgtct ctgaaactct tcatgaccca
42631 gacttgtatg ctgcgaaccg tcgagaactt cgagctgacg agcaaaaact tcgcgaaact cgttacgcaa
42701 tcgaagatga aattctagct gaacagtcaa agactgaaac agctctaaca gctgaataag gaggcgtcaa
42771 tctatgccaa tgtggctaaa cgacacagca gtcttgacga cgattattac agcgtgcagc ggagtgctta
42841 ctgtcctact aaataagtta ttcgaatgga aatcgaataa agccaagagc gttttagagg atatctccac
42911 aactcttagc actcttaaac agcaggtcga cgggattgac caaacgacag tagcaatcaa tcaccaaaat
42981 gacgtcattc aagacggaac tagaaaaatt caacgttacc gtctttatca cgacttaaaa agggaagtga
43051 taacaggcta tacaactctc gaccatttta gagagctctc tattttattc gaaagttata agaacctcgg
43121 cggaaatggt gaagttgaag ccttgtatga aaaatacaag aaattaccaa ttagggagga agatttagat
43191 gaaactatct aacgaacaat atgacgtagc aaagaacgtg gtaaccgtag tcgttccagc agcgattgca
43261 ctaattacag gtcttggagc gttgtatcaa tttgacacta ctgctatcac aggaaccatt gcacttcttg
43331 caacttttgc aggtactgtt ctaggagttt ctagccgaaa ctaccaaaag gaacaagaag ctcaaaacaa
43401 tgaggtggaa taatgggagt cgatattgaa aaaggcgttg cgtggatgca ggcccgaaag ggtcgagtat
43471 cttatagcat ggactttcga gacggtcctg atagctatga ctgctcaagt tccatgtact atgctctccg
43541 ctcagccgga gcttcaagtg ctggatgggc agtcaatact gagtacatgc acgcatggct tattgaaaac
43611 ggttatgaac taattagtga aaatgctccg tgggatgcta aacgaggcga catcttcatc tggggacgca
43681 aaggtgctag cgcaggcgct ggaggtcata cagggatgtt cattgacagt gataacatca ttcactgcaa
43751 ctacgcctac gacggaattt ccgtcaacga ccacgatgag cgttggtact atgcaggtca accttactac
43821 tacgtctatc gcttgactaa cgcaaatgct caaccggctg agaagaaact tggctggcag aaagatgcta
43891 ctggtttctg gtacgctcga gcaaacggaa cttatccaaa agafcgagttc gagtatatcg aagaaaacaa
43961 gtctcggttc tactttgacg accaaggcta catgctcgct gagaaatggt tgaaacatac tgatggaaat
44031 tggtattggt tcgaccgtga cggatacatg gctacgtcat ggaaacggat tggcgagtca tggtactact
44101 tcaatcgcga tggttcaatg gtaaccggtt ggattaagta ttacgataat tggtattatt gtgatgctac
44171 caacggcgac atgaaatcga atgcgtttat ccgttataac gacggctggt atctactatc accggacgga
44241 cgtctggcag ataaacctca attcaccgta gagccggacg ggctcattac tgctaaagtt taaaatatag
44311 agaggaggaa gctcttttct taatattgtt tctcttaatc ccgcaaggtt tcgaccctgc ggggttttgt
44381 gtcgtatatt actctattta cttattcgaa gatttcaatt ataattaaat agtcaacatg attcatgatt
44451 gttgatatga ccctttccgc cctacataat ttgtggggcg tttatttttt ataaaaattt tttacaaaat
44521 gcttgacaac attcactcat tatcgtataa tacaattata aaaataaata aagccgaaag gcgaggagga
44591 cattatgtca aaaattaaat tcgaaaacct taaaaaaggc gatgttgtgc tacgagctaa atctcaaacg
44661 aagtttaaaa tcgtttcaat tttagcagac gaaaagaaag cagaccttga atcattagaa gacggaggtg
44731 aacttcacct ttcagcttca actctcgaac gttggtacac aatggaagat gaaactgaac ctaaaaaaga
44801 agaagctgct aaacctgcta aaaaggctgc tcctgcagtt gctcgacctg ctcgaaaagg tagagtcgtt
44871 cccaaaccta aaaaagaagt ccttgaggaa gaaattcctg aagttaagga acagccggaa gaagttggtt
44941 cagttagtga gaaatctact gttcgaaaac ctgctcctaa aaaagaaagc gtgatggcga ttactaaggc
45011 tcttgaaagt cgaattgttg aagcctttcc tgcgtctact cgaatcgtca ctcagtctta catcgcctat
45081 cgctctaaga agaacttcgt tactatcgaa gaaactcgaa aaggtgtttc tattggagtt cgcgcaaaag
45151 ggttgacaga agaccaaaag aaacttcttg catctattgc tcctgcatct tacgaatggg cgattgacgg
45221 aatttttaaa ctcgtcaagg aagaagatat tgacaccgca atggaattga ttgaagcttc tcacctttct
45291 tcgctatgat tgaaatcgtt atagcacgtt cgaaagctag gcgaggtcga accctattta ttgaaacatg
45361 ggcaagcact gatgaagatg cagttaaaat ggcagaaaag atttccagct tgcccaatgt agtcgagacg
45431 tcttctaata acttcgaact accttataag tatttcaata atgttataga cgctctagat gaatgggagc
45501 ttcacatctt cggcgaactt gataaagatg ttcaagacta cattgactct cgaaaccgaa tagcttcttc
45571 aagcaatgag cagttttcgt tcaagactac tccattcgcg caccaggttg aatgtttcga atacgcacaa
45641 gagcatccat gtttcctttt aggcgatgag caaggtttag ggaaaactaa acaggcaatt gatattgcag
45711 ttagcaggaa ggcaagtttc aaacattgtt taatcgtatg ttgcatatca gggctcaaat ggaattgggc
45781 aaaagaagta ggtattcatt caaatgagtc agctcatatt ttaggaagtc gagtcactaa agatgggaaa
45851 ttagtgattg acggagtttc taaacgggca gaagacttgc ttggtggcca cgacgaattc ttccttatca
45921 ctaacattga aactcttcgc gatgctgtgt tcattaaata cttaaatgaa ctgacaaaaa gcggagaaat
45991 tggaatggtt attattgacg agattcacaa gtgtaagaac ccttcaagta agcaaggggc ttcaattcaa
46061 aagctccaaa gttattacaa gatgggactt acaggaaccc ctctaatgaa taacccaatc gatgtattca
46131 atgttatgaa gtggctaggg gcggaacatc atacactgac tcagttcaaa gagcgatact gtatcgtcga
46201 ccagttcaat caaatcactg gatatcgaaa tccagctgaa cttcgcgagc ttgtcaacga ctacatgctt
46271 agaagaacga aggaagaagt tttagacctg cctgaaaaga ttcgagtcac agagtatgtc gacatgaact
46341 cgaaacagtc aaaaatctat aaggaagttt tgactaaact tgttcaagaa atagataaag tcaagctcat
46411 gcctaaccct ctagccgaaa cgattcgact tcgacaagcg actggaaatc cttcgatttt aactactcaa
46481 gatgtcaagt cttgcaagtt cgaaagatgt atcgaaattg tcgaggaatg tatccagcaa ggaaagtcct
46551 gcgtgatatt tagcaattgg gaaaaggtta ttgaacctct tgctaagata ctttcgaaga cagtcaaatg
46621 caacctggta acaggagaaa ccgcagataa gttcaacgaa attgaagaat ttatgaatca cagaaaggct
46691 tctgttattt taggaactat aggtgcgcta ggaacaggat ttactttgac gaaagcggat acggttattt
46761 tcttagatag tccgtggaca cgcgcagaaa aggaccaagc cgaagacagg tgtcatagaa ttggcgcaaa
46831 aagttctgtc actatctaca cgcttgtcgc caaaggtact gttgacgaac gtatagaaga ccttattgaa
46901 cggaaaggag aattagcaga ttatatcgta gatggtaagc ctatgaaatc taaaattggt aaccttttcg
46971 atatcctgct taaatagaat gaaaactatc tccatattaa ggaaagacac taaaaggaag ccggacagga
47041 acggaagaaa aactgcactc gaactagctc aagagattga tatgtcacct agtgagttag cagagctcct
47111 tcaaattcct gaaaggacgg caaccagaat tttaaaactc gacaaactgc tcaacaaaga gcaatgctca
47181 ataatagaaa ggtatataaa tgaaattcac tgaaggaaaa aattggtata aagttggaga gatatgtcaa
47251 atgttgaacc gctctctatc tacgattaat gtttggtatg aagcaaaaga cttcgctgaa gaaaataaca
47321 ttcacttccc gtttgttctt cctgaaccta gaacagacct tgaccatcgt ggttctcgat cctgggatga
47391 cgaaggcgtg aacaaactca aacgatttag ggacaaccta atgcgcggtg acttggcact ctacactcga
47461 actcttgtag ggaaaactga aagggaagca attcaagaag atgctaaagc atttaaacgt gaacatggat
47531 tggagaatta aatgaaattt gaagatgaaa aacagttcat cgctgcaatt gaagaagccg gtgaattaaa
47601 tgctaccaaa ggcgacatgg agaaacaagt caaaagtctt cgtgatgctc taaaagagta catgaaagaa
47671 aatgacattg aatctgctca aggtaagcac ttttctgcta ccttctacac gacagagcgc tcaactatgg
47741 acgaagaacg cttgaaagaa attatcgaaa aattagttga cgaagccgag acggaagaaa tgtgtgaaaa
47811 actttcaggg cttatcgaat acaagcctgt catcaatacg aaacttctcg aggatatgat ttatcacggc
47881 gagattgacc aagaagcaat tcttccagca gttgtcattt ctgttacaga aggcattcgt tttggaaagg
47951 ctaaaattta gcgatatttt tggttctgcg acgtttttag ggttagcaga atccaatcac accacttgcg
48021 caggcaaccg ctgtctgcgt taattttaga aggttaatat tataccataa ggaggagata agtggcaagg
48091 caaagaatag gcaattcagg aaagcctaaa aatgaaattg aactaacatt caaagacaag cctaaaactc
48161 gttctacctt attcaagaag gacgtggcaa caggtctttc aaaagtcgag catgattatt ttcaaatagt
48231 tgaagcactt aacggaaaac aattcgaacc taatatgaag caggtgtcat ctttctttat agttcagtat
48301 gaatttattt tcaatattaa gtgcatcgat tataactggt tcaacttttc gagcactatg aaaaatgttc
48371 gaacttattt aaacattgag tcgaacattg aactttgtcg atttttagct gaaagttttg ttaaatatga
48441 aaatgttcga aaaagattga acctaagcga aaggttcata acggtctcga ctttcaaaag agcctggatt
48511 ttggacgaac tcgaaggaaa aacgggttca aaattcgaag gattttatta gtttagtaga ctatttttag
48581 attttttaaa atgtggttta caaaatgacc tcaataggcg tataatttat caatcttgat tctttcgggc
48651 cggtatatat acaccaataa tcgagaaata ataaattata gtatcgaaaa tataaaaagg agaaaagttg
48721 gaaaatttag ctgatagaat atggaagaaa aagttaaatg accttttcga gagaagtggg ctacctcaaa
48791 agtatttcga acctcaagtg ttagtcgaac gaaaagccga caaggaatgt tgggaatggc tagaagctgt
48861 tcgagcaaat atagtcgaag aagttcgaaa cggtcttagc attgttattg cttcgaatac tgtcgggaat
48931 gggaaaacta gctgggcggt tcgacttttg caacgctatt tagcagaaac tgcacttgac ggaagaattg
49001 ttgagaaagg aatgtttgta gtgtcagctc aactattgac tgagttcggc gactataatt attttcaaac
49071 catgcaagaa tttctcgaac gtttcgagcg ccttaagact tgtgagctat tagtcataga cgaaataggt
49141 ggaggttcct taaccaaggc ctcttatcct tatctgtatg acttggttaa ttatagggtt gacaataact
49211 tgtcgactat ttatacgact aattatactg acgatgaaat tattgacctt ttaggccaaa ggctttatag
49281 tcgtatatat gatacttcag tggttctaga ttttcaggca agcaatgtaa gaggattgga ggtaagcgaa
49351 attgaatcat agatatagta acatcacaac tatttttctt tggcagattg tctttctttg tatttgctgc
49421 gcggtgtcct attgtgcagg agtgcataat gagcgagagt ctcaagataa ggtgattcaa agttataagc
49491 agaaagaaaa gtcagccgtc tacttgacag tcgatagttc aggagcttgg ctaggaagtg ctccgggagc
49561 caaggaaagt cctctctaca atgaaaaggg acagcatgta ggaaaattga aagaggtggg agagtgatac
49631 agcttcaagt cttaaataaa gttctcgaag aaaagagctt atccatttta gaaaataatg gaattgacca
49701 agaatacttc acggattatt tagacgagta tcaatttatt caagaacact tttcgagata tggaagagtt
49771 ccggacgacg aaactattct cgaccatttt cctggattcg aatttttcga aattggcgaa actgatgaat
49841 accttatcga caagctaaaa gaggagcatc tatataattc acttgttcca attttaacgg aagcggctga
49911 ggacattcaa gtagatagta acattgcgat tgcgaatata attccaaaac tagaagaact tttcaatcgc
49981 tctaaattcg taggcggact agacattgct cgaaatgcta aacttcgact agactgggcg aatactatta
50051 gaaaccatga cggtgaaaga cttggaatat cgacagggtt tgaactattg gacgacgtgc ttggaggctt
50121 acttcctggt gaggatttga ttgtcataat ggctcgacct ggacaaggta agtcgtggac tattgataaa
50191 atgcttgcaa ctgcttggaa gaacgggcat gatgtccttc tatatagcgg ggaaatgagt gaaatgcaag
50261 ttggtgctcg tatagatact attctttcga atgttagcat caattcaatt accaaaggga tttggaacga
50331 ccatcagttc gaaaaatatg aggaccatat tcaagcaatg actgaggctg aaaattccct tgtggtagtc
50401 acgcccttta tgattgrgagg aaagaacctt acccctgcaa ttttagatag catgatatct aaatatagac
50471 catctgtggt ggggattgac cagctttcac tcatgagcga gtcttatcca agcagggagc agaagcgaat
50541 ccagtacgcc aacatcacca tggacctata taagatttct gctaaatatg gaattcctat tgtgcttaat
50611 gtccaagcag ggcgttcggc taaaactgaa ggcgctgaaa gtatggaact agaacatata gcagaaagtg
50681 atggagtagg tcaaaatgct agcagagtta tcgctatgaa gcgtgacgaa aaatccggca tacttgaact
50751 atctgtcgtt aaaaaccgat atggcgaaga ccgaaaaatc atcgaatata tgtgggacgt tgaaactgga
50821 acctatactc ttataggatt caaagaggaa ggcgaagaag gaactgaaaa aggcgaaagc tctccattga
50891 aagcaaaagc ctctaggtcg actgctcgtc ttcgaagtaa ggttacaagg gaaggagttg aagcattttg
50961 atgaaagtaa atggtcttca aattgaagcg actcctgaac aaataattga aaaactttcg agacaacttg
51031 aagacgaagg aacattcatt tttagacgaa ctaagtcgct tggaagcaac tatcaattct catgcccgtt
51101 tcatgcagga gggactgaaa agcatccctc ttgtggcatg agtagaaatc cttcttattc aggaagtaag
51171 gtgacggaag ctggaacggt tcactgtttc acttgcggct acacttcagg actaactgaa ttcgtctcga
51241 atgtattagg tcgaaacgat ggagggttct atggaaacca gtggctgaaa aggaattttg gaacatctag
51311 cgaagtagtt aggcaaggcg tcagccctga agcgtttcga agaaatggga gaactgaaaa agtcgagcat
51381 aaaatcattc ctgaagagga acttgataaa taccggttta ttcatcctta tatgtatgaa cggaaattga
51451 cggacgagct caccgagatg tttgatgtag gttatgacaa actgcatgat tgcatcacct ttccagtacg
51521 gaacctcaag ggcgaaacag tattcttcaa ccgtcgaagt gttcgttcta agtttcacca gtacggtgaa
51591 gatgacccta aaacggaatt tctttatggc caatatgagc ttgtagcatt tcgagactat tttgaaaaac
51661 ctattagtca agtattcgtg actgagtctg ttatcaactg cttgactctt tggtcaatga agattccagc
51731 agtcgctctt atgggagtag gtggaggaaa tcaaatcaat ttactaaaac gacttcctta tagaaatatt
51801 gttctagcac ttgaccctga taacgctggg cagacagcgc aggaaaaact ctaccgacag ttaaagcgaa
51871 gcaaggtcgt tagatttttg aactacccta aagagttcta tgataataag tgggatataa acgaccatcc
51941 ggaattatta aattctaatg atttagtctt gtagaaattc atttattatc gtataataaa gttagaaaat
52011 tttaaaaaga ggtcatatca atacgaaaga agcgaataga ctagtttcta gctatgtagg attcgaatgc
52081 tggactgacg aagaatgtat caggaacttt gaactagacc ctgatatgtc aattgcgtct gcttatcatc
52151 gttattttgg gacgctttat tcctatgcaa aaaggtttaa atgcttatct cgacatgaca ttgaaagcat
52221 tgcattcgag actatttcaa aatgtttggc aacgttcaaa tcaaaccaag gggccaagtt ttcaacttac
52291 cttacaagac tcttcaagaa cagaatagtc ttagaatata ggtacctaaa tgcaccttcc atgaatcgaa
52361 attggtatgt agaagtgacg ttcgatagcg tttcgacaaa tgaagaaggc gacgatttta gtatcctatc
52431 gacagttggc tattgtgaag actacggaaa aattgaaatt gaagcaagtc ttgacttcat gacgctttct
52501 aatacagagt atgcttatat ctcgtctgtc attcaaaacg gtccttcagt aagcgacgca gaaattgcgc
52571 gtgaaattgg agtaagcagg tctgctatta gtcagtctaa gaagtcacta aaaaataaat taaaagattt
52641 tatataactg gtttacaaat cacgtgaatt

52711 aaaaacttca aaaatctttc aaccattaaa 

52781 aaaaatcagg aacatttagc tcagggtcta

52851 aatcgtcact ctattgtatg atgacccgga 

52921 gttgacggtc gtcgacgcta tatcaattgc 

52991 attgtccatt atgccaaaac ggattccctc 

53061 gggaaaagtt gaaacatggg accgaggccg 

53131 ggaagccttg tgactcagcc ttttgaaatt 

53201 aattccttcc agagcgtccg gaagacagtg 

53271 aactctaatt ttagacctcg acgaagacca 

53341 gagcgttctt caagtcgttc aaattcacgt 

53411 aatcttcaca aggtcgaaca gctgaaagaa 

53481 aggattctaa catgagggcg cgagccctct 

53551 gactctttgg tgcaaagcct cgttctagca 

53621 gaagcctgca gttgaggtta cttacatttc 

53691 ctttcaacta ggattcttgg acacgttctt 

53761 agtatgtaga caaaatgatt gaagacggaa 

53831 tcacgatgag ctggcaggag tctgcttgta 

53901 gttagcaata tgacgaagat gcgaattaag 

53971 ggattgtaga ttcaggaatt cctgtcatct 

54041 actcggcgtc aaaatgaatg agccagcgtg 

54111 tctcacagct tgaaaagtct tcactctaaa 

54181 atgacttatt taaaggaatt ccttttagtt 

54251 ccctttgcaa actttcgaac tctatgaatt 

54321 gaatataacc tggaaaaagt ctcatgggtt 

54391 acatggaagt ctacggtgtc gacttagacc 

54461 tatgaacgag gctgagcaag agtttcaaca 

54531 caaactaatt tccagagcta tcaaaaactc 

54601 gtcctactca attagcaatt ctgttttatg 

54671 aggaacaggc gaaagtattg tcgagcattt 

54741 tatgcaaaat tagtttcgac ctatacaaca 

54811 ctacattcaa acagtacgga gctaagacag 

54881 ttctcgcggt gagggtgcag tagttcgaca 

54951 gactactctc aacaagaacc tcgttcattg 

55021 . aacaaaacct ggacctatat tcagttatcg 

55091 gttctatccc gacggaacga ctaacaagga 

5 5161 ggtcttatgt acggccgcgg ggctaactca 

S5231 aggttattga agatttcttc accgagttcc 

55301 gcaggacttg ggatatgttc aaacagctac 

55371 tacgagttcg agtatatcga cgctagcaag 

55441 agatggacga tactgttcct gaacatatta 

5 5511 taagaagaag caagaaatta aagaccaggc 

5 5581 atagctgatg ctcagcgcca atgtttgaac 

5 5651 caatgattaa ggtacacaat gacgctgaat 

5 5721 tgagttacta ggtgaggttc ctatcaagaa 

55791 gaagcagcca aggacattat tagtcttcca 

55861 aagaaattga aatctaaaat ctattcagtt 

S5931 tttatttcga acctttaaat gtgaaaggaa 

56001 cctgcttata aatctaataa gcaagtacga 

56071 ttccttacct cgttgatttg ctttatgcaa 

56141 tttggataag tcaaaaagca agtgtcttta 

56211 aaatgctttg tctagcaaca tcggttctat 

56281 cttggaattg gaacggttgc atatatagat 

56351 tcttgcagtc aattgcttcg agatatttga 

56421 tgcttcgaga tatttgaaaa agtagtcagg 

56491 attcattcat tattat 



tcgtgtatac tatatatgaa aggacaaact ttgaaacctt 
aacttataaa ggagaatcga tatgggaaaa gtatcaattc 
ataacgagtt tttcacactc gctgaccacg gtgacagcgc 
aggcgaagac atggattatt tcgtagtcca cgaagcagac 
aatgctattg gcgaagacgg ggaaacagtc catcctgata 
gtattgaaaa actatttctt caactttaca accatgatac 
ttcttatgtt caaaagattg ttacatttat caataaatat 
attcgttcag gagctaaagg tgaccaacga actacttatg 
ctactcttga agattttcca gaaaagagcg aacttcttgg 
aatgtttgac gtggttgacg gcaagttcac tcttcaagaa 
agaggagcat ctcctgcgcc tagacgaggt tccggtcgag , 
ctccttcagt .tagtcgaaga actcctccaa cacgaggtcg 
ttattattga ttaagaaagg gaaaataatg gcacaaaaag 
agaagaacga tgctcagtta cttgctcaac ggaaaaacag 
aggaaacgct ctaaaggacg cagttgctag agctcgtact 
gatagacttg agttaatcac tgaggaagca aaactcgagc 
taggttctat tgacgtagaa actgatggac tcgatactat 
ctcacctagt caaaaaggaa tctatgctcc tgtcaatcat 
aatcaaattt ctcctgagtt catgaagaaa atgcctcaac 
atcataattc gaaatttgac atgaaatcga tttattggcg 
ggatacatat ttagccgcaa tgcttttaaa tgaaaacgag 
tatgttagga acgaagaaaa cgcagaggtt gcaaaattta 
taattcctcc tgatgttgcc tatatgtatg cggcctatga 
tcaagaacaa tacttgactc caggaactga acaatgtgaa 
cttcataata ttgagatgcc tctaattaaa gttctcttcg 
aagataagct ggcagaaatt agagaacagt ttactgccaa 
gcttgtcagc gaatggcagc ctgaaattga agaacttcga 
gaaatggatg caagaggtcg agtgacggta agcatttcca 
atatcatggg attgaaaagt cctgaaaggg ataaacctag 
tgataacgat atctcaaaag cacttttgaa atatagaaaa 
cttgaccaac accttgcaaa gcctgacaat cgaattcaca 
ggcgtatgtc aagtgagaat cctaacttac agaatattcc 
aatctttgca gccagtgaag ggcattacat tattggtagt 
gcggaattaa gtggcgacga aagtatgcga catgcttacg 
gttcgaaacc ttatggtgtt ccctatgaag agtgtttaga 
aggaaaactt cgaagaaatt ctgtcaagtc cgttctttta 
atcgctgagc agatgaatgt atctgtcaaa gaagcgaata 
ctaaagtggc agactatatc atattcgttc aacagcaggc 
cggtcgaaga agaaggcttc ctgatatgag tcttcctgaa 
aacgaagatt tcgacccctt taactttgac gcagaccaac 
tcgaaaaata ttgggcccag ctagatagag cctggggatt 
aaaagccgaa ggaattctta ttaaggataa cggaggcaag 
tcagttattc aaggaacggc agccgacatg actaagtacg 
tgaaagaatt aggattccat ttaatgattc cagttcacga 
cgcaaaacgg ggagcagaaa ggttgacaga agttatgatt 
atgaaatgtg accccagtat agtagaaaga tggtatggtg 
gcatatataa ttctagtagt tattgcgaac cttgtgacaa 
ttttaattcc tccaagcagt tggtttatgg gattcacttt 
gaagccaaaa tttgcaggtt ctttgatatg ggtagggtta 
aacctaccac aatcgcttgt cgtggcttca ggagttgcat 
tattcgacaa gccctcgaat aaattagact cgaagattgc 
tatagacgca accatatgga tttcattagg actgagtcct 
attccgtcag ccgtactagg ccaagttcta gttcagttta 
aaaagtagtc aggaaaattc ctgattatct tgcagtcaat 
aaaactcctg attatttttt ttacaaaaac gcttgacttt 



Fig. 7 

Abbreviations: 

kan: gene encoding kanamycin resistance 

cat: gene encoding chloramphenicol resistance 

ori + and -: origin of replication in gram-positive and 

gram-negative bacteria, respectively 

arsR: gene encoding regulatory protein of the ars promoter 
P: ars promoter 

lucFF: gene encoding luciferase protein. This portion will 
be removed and replaced by individual S. aureus phage 
genes. 

Reference: 

Tauriainen et al., Appl. Environ. Microbiol. 1997. 63: 4456-4461. 

Table 29 



Phage dp1 ORFs list 




379 



64  dp1ORF066  -3  28566..28898  110 
65  dp1ORF067  -1  44735..45061  108 
66  dp1ORF068  3  29451..29768  105 
67  dp1ORF069  -3  20094..20411  105 
68  dp1ORF061  -3  19161..19475  104 
69  dp1ORF070  1  15973..16284  103 
70  dp1ORF071  3  38904..39209  101 
71  dp1ORF072  -2  50749..51045  98 
72  dp1ORF073  3  14262..14555  97 
73  dp1ORF074  3  32298..32591  97 
74  dp1ORF075  -1  22154..22447  97 
75  dp1ORF076  -1  5435..5728  97 
76  dp1ORF077  1  14800..15084  94 
77  dp1ORF079  -3  35007..35288  93 
78  dp1ORF081  -3  55188..55466  92 
79  dp1ORF103  2  49352..49627  91 
80  dp1ORF080  1  42490..42759  89 
81  dp1ORF082  1  44728..44994  88 
82  dp1ORF083  -1  35720..35974  84 
83  dp1ORF065  -3  51246..51497  83 
84  dp1ORF085  -3  10602..10847  81 
85  dp1ORF087  -2  29794..30036  80 
86  dp1ORF088  3  5040..5279  79 
87  dp1ORF089  -2  12256..12495  79 
88  dp1ORF273  3  56256..56486  76 
89  dp1ORF078  -3  17280..17507  75 
90  dp1ORF090  1  27037..27261  74 
91  dp1ORF091  1  43189..43413  74  Holin 
92  dp1ORF092  3  46989..47213  74 
93  dp1ORF093  -2  45538..45756  72 
94  dp1ORF095  3  8877..9089  70 
95  dp1ORF096  -1  46469..46681  70 
96  dp1ORF097  -1  38888..39100  70 
97  dp1ORF098  1  43627..43836  69 
98  dp1ORF099  3  38298..38507  69 
99  dp1ORF100  1  1597..1803  68 
100  dp1ORF101  2  19220..19426  68 
101  dp1ORF094  1  8281..8484  67 
102  dp1ORF102  2  4034..4237  67 
103  dp1ORF104  -1  21224..21427  67 
104  dp1ORF105  -2  1828..2028  66 
105  dp1ORF106  -3  10329..10529  66 
106  dp1ORF108  -1  49250..49447  65 
107  dp1ORF109  -2  31435..31632  65 
108  dp1ORF110  1  16444..16638  64 
109  dp1ORF111  1  28657..28851  64 
110  dp1ORF113  -2  17521..17715  64 
111  dp1ORF084  1  15445..15636  63 
112  dp1ORF114  2  52952..53143  63 
113  dp1ORF115  -3  5151..5342  63 
114  dp1ORF116  -1  20474..20662  62 
115  dp1ORF117  -3  24492..24680  62 
116  dp1ORF118  2  15023..15208  61 
117  dp1ORF119  2  41054..41239  61 
118  dp1ORF120  1  28387..28569  60 
119  dp1ORF121  3  39222..39404  60 
120  dp1ORF122  -1  40220..40402  60 
121  dp1ORF123  -2  21145..21327  60 
122  dp1ORF124  -3  17712..17891  59 
123  dp1ORF125  -3  49740..49916  58 
124  dp1ORF126  -3  15960..16136  58 
125  dp1ORF127  -3  13335..13511  58 
126  dp1ORF128  1  4852..5025  57 
127  dp1ORF129  2  25133..25306  57 
128  dp1ORF130  -1  16619..16789  56 
129  dp1ORF131  1  43846..44013  55 
130  dp1ORF132  -1  15137..15304  55 
131  dp1ORF133  -2  7900..8061  53 
132  dp1ORF135  3  780..938  52 
133  dp1ORF136  -1  55094..55252  52 
134  dp1ORF137  -2  36988..37146  52 
135  dp1ORF138  -3  30504..30662  52 




136  dp1ORF139  -3  11934..12092  52 
137  dp1ORF140  3  20562..20717  51 
138  dp1ORF141  -1  42767..42922  51 
139  dp1ORF142  -3  31743..31898  51 
140  dp1ORF143  -3  7410..7565  51 
141  dp1ORF144  1  36517..36669  50 
142  dp1ORF145  1  42067..42219  50 
143  dp1ORF146  1  51484..51636  50 
144  dp1ORF147  1  55207..55359  50 
145  dp1ORF148  -1  28484..28636  50 
146  dp1ORF150  -3  15033..15185  50 
147  dp1ORF134  -2  349..498  49 
148  dp1ORF151  1  28027..28176  49 
149  dp1ORF152  1  42235..42384  49 
150  dp1ORF153  2  22307..22456  49 
151  dp1ORF086  2  52760..52906  48 
152  dp1ORF154  2  18446..18592  48 
153  dp1ORF155  3  13512..13658  48 
154  dp1ORF156  3  18777..18923  48 
155  dp1ORF157  -2  13135..13281  48 
156  dp1ORF158  -3  40581..40727  48 
157  dp1ORF159  -3  30225..30371  48 
158  dp1ORF149  -3  26331..26474  47 
159  dp1ORF160  2  41324..41467  47 
160  dp1ORF161  2  52175..52318  47 
161  dp1ORF162  3  13020..13163  47 
162  dp1ORF163  3  40224..40367  47 
163  dp1ORF164  -2  6553..6696  47 
164  dp1ORF165  -3  50361..50504  47 
165  dp1ORF166  -3  23376..23519  47 
166  dp1ORF167  3  1008..1148  46 
167  dp1ORF168  -2  54205..54345  46 
168  dp1ORF169  -2  45814..45954  46 
169  dp1ORF170  -2  27460..27600  46 
170  dp1ORF171  -3  47538..47678  46 
171  dp1ORF172  -1  10325..10462  45 
172  dp1ORF173  -2  32023..32160  45 
173  dp1ORF174  -2  29629..29766  45 
174  dp1ORF175  -2  15511..15648  45 
175  dp1ORF176  -3  42894..43031  45 
176  dp1ORF177  -3  19800..19937  45 
177  dp1ORF178  -3  11787..11924  45 
178  dp1ORF112  2  32207..32341  44 
179  dp1ORF179  3  56058..56192  44 
180  dp1ORF180  -1  41042..41176  44 
181  dp1ORF181  -1  12992..13126  44 
182  dp1ORF182  -2  45235..45369  44 
183  dp1ORF183  -2  13762..13896  44 
184  dp1ORF184  -3  53196..53330  44 
185  dp1ORF185  1  22522..22653  43 
186  dp1ORF186  2  21272..21403  43 
187  dp1ORF187  2  34415..34546  43 
188  dp1ORF188  2  35609..35740  43 
189  dp1ORF189  2  42587..42718  43 
190  dp1ORF190  3  39786..39917  43 
191  dp1ORF191  -1  40865..40996  43 
192  dp1ORF192  -1  2789..2920  43 
193  dp1ORF193  -2  42325..42456  43 
194  dp1ORF194  -2  40153..40284  43 
195  dp1ORF195  -3  42453..42584  43 
196  dp1ORF196  -3  11142..11273  43 
197  dp1ORF107  1  10750..10878  42 
198  dp1ORF197  2  7484..7612  42 
199  dp1ORF198  2  24119..24247  42 
200  dp1ORF199  -1  15514..[illegible]  42 
201  dp1ORF200  -3  47715..47843  42 
202  dp1ORF201  1  38569..38694  41 
203  dp1ORF202  2  44483..44608  41 
204  dp1ORF203  -3  22656..22781  41 
205  dp1ORF204  1  1471..1593  40 
206  dp1ORF205  1  8524..8646  40 




207  dp1ORF206  1  19855..19977  40 
208  dp1ORF207  1  27502..27624  40 
209  dp1ORF208  2  47279..47401  40 
210  dp1ORF209  3  29784..29906  40 
211  dp1ORF210  -1  52955..53077  40 
212  dp1ORF211  -1  20837..20959  40 
213  dp1ORF212  -2  52861..52983  40 
214  dp1ORF213  -2  30169..30291  40 
215  dp1ORF214  -2  24151..24273  40 
216  dp1ORF215  -3  35700..35822  40 
217  dp1ORF216  -3  32727..32849  40 
218  dp1ORF217  1  23443..23562  39 
219  dp1ORF218  3  22029..22148  39 
220  dp1ORF219  -1  51269..51386  39 
221  dp1ORF220  -1  6215..6334  39 
222  dp1ORF221  1  43507..43623  38 
223  dp1ORF222  3  13212..13328  38 
224  dp1ORF223  3  14055..14171  38 
225  dp1ORF224  -1  13505..13621  38 
226  dp1ORF225  -2  32875..32991  38 
227  dp1ORF226  -2  25075..25191  38 
228  dp1ORF227  -2  22999..23115  38 
229  dp1ORF228  1  10450..10563  37 
230  dp1ORF229  [illegible]  27634..27747  37 
231  dp1ORF230  [illegible]  50723..50836  37 
232  dp1ORF231  [illegible]  30958..31071  37 
233  dp1ORF232  -2  29272..29385  37 
234  dp1ORF233  -3  52779..52892  37 
235  dp1ORF234  1  36253..36363  36 
236  dp1ORF235  2  32768..32878  36 
237  dp1ORF236  -1  37418..37528  36 
238  dp1ORF237  -1  1568..1678  36 
239  dp1ORF238  -3  1191..1301  36 
240  dp1ORF239  1  26521..26628  35 
241  dp1ORF240  1  41893..42000  35 
242  dp1ORF241  -1  46913..47020  35 
243  dp1ORF242  -1  41231..41338  35 
244  dp1ORF243  -2  51199..51306  35 
245  dp1ORF244  -3  26976..27083  35 
246  dp1ORF245  -3  6171..6278  35 
247  dp1ORF246  -3  2724..2831  35 
248  dp1ORF247  1  29641..29745  34 
249  dp1ORF248  1  53560..53664  34 
250  dp1ORF249  2  2012..2116  34 
251  dp1ORF250  2  23837..23941  34 
252  dp1ORF251  -1  39101..39205  34 
253  dp1ORF252  -2  54667..54771  34 
254  dp1ORF253  -3  56151..56255  34 
255  dp1ORF254  -3  48375..48479  34 
256  dp1ORF255  -3  9468..9572  34 
257  dp1ORF256  1  15289..15390  33 
258  dp1ORF257  1  28216..28317  33 
259  dp1ORF258  1  44023..44124  33 
260  dp1ORF259  2  4298..4399  33 
261  dp1ORF260  2  24746..24847  33 
262  dp1ORF261  3  288..389  33 
263  dp1ORF262  3  9408..9509  33 
264  dp1ORF263  -1  26951..27052  33 
265  dp1ORF264  -1  6038..6139  33 
266  dp1ORF265  -1  4700..4801  33 
267  dp1ORF266  -2  50119..50220  33 
268  dp1ORF267  -2  47266..47367  33 
269  dp1ORF268  -2  12520..12621  33 
270  dp1ORF269  -3  53733..53834  33 
271  dp1ORF270  -3  50691..50792  33 
272  dp1ORF271  -3  19638..19739  33 
273  dp1ORF272  -3  1455..1556  33 
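The columns of Table 29 appear to be related arithmetically, which gives a way to spot misread digits. The sketch below is an illustration only, not part of the original disclosure. It assumes that each position range includes the stop codon (so the size in amino acids is the number of codons minus one) and that, for forward-strand ORFs, the frame is (start - 1) mod 3 + 1. Both assumptions are inferred from rows of the table itself. The function names are hypothetical.

```python
def expected_size_aa(start: int, end: int) -> int:
    """Protein length implied by a start..end range that includes the stop codon."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"range {start}..{end} is not a whole number of codons")
    return span // 3 - 1


def forward_frame(start: int) -> int:
    """Reading frame (1, 2 or 3) of a forward-strand ORF starting at `start`."""
    return (start - 1) % 3 + 1


# (name, frame, start, end, size) taken from forward-strand rows of Table 29
ROWS = [
    ("dp1ORF070", 1, 15973, 16284, 103),
    ("dp1ORF071", 3, 38904, 39209, 101),
    ("dp1ORF103", 2, 49352, 49627, 91),
]


def row_is_consistent(row) -> bool:
    """True when the listed size and frame agree with the listed coordinates."""
    _name, frame, start, end, size = row
    return expected_size_aa(start, end) == size and forward_frame(start) == frame


for row in ROWS:
    print(row[0], "consistent" if row_is_consistent(row) else "inconsistent")
```

A row whose coordinates do not span a whole number of codons, or whose size disagrees with the span, is a likely candidate for a transcription error in one of its digits.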
Table 30 



Predicted Dp-1 amino acid sequences 

dp1ORF001 

36698 atgattgacaataatttacctatgagtccaattcctggcgaaattgttcaagtatatgaccaaaacttcaatctaattggagca 
1 MIDNNLPMSPIPGEIVQVYDQNFNLIGA 
36782 agtgatgaaatctttagcaagcattacgaagacgaaattgtgactcgagctcgaggaaaagaaactttcacttttgaaagtatt 
29 SDEIFSKHYEDEIVTRARGKETFTFESI 
36866 gaaacctcatctatctatcaacacttaaaggttgaaaacattatccagtatggaggaagatggtttcgaattaaatatgctcag 
57 ETSSIYQHLKVENIIQYGGRWFRIKYAQ 
36950 gacgtagaagatgtcaaagggcttaccaagtttacctgctacgcattacggtatgaactagcagaaggcttgcctaggaagttg 
85 DVEDVKGLTKFTCYALWYELAEGLPRKL 
37034 aaacacgttgcttcttctgtaggcgctgtcgcgctagatattatcaaagacgcaggtgaatgggttcgactagtttgtcctcct 
113 KHVASSVGAVALDIIKDAGEWVRLVCPP 
37118 gacggtgctaacaaacaagttcgaagcataacagccgcagaaaattcaatgctttggcatcttcgatatcttgcaaagcaatac 
141 DGANKQVRSITAAENSMLWHLRYLAKQY 
37202 aatttagaattgacatttggttatgaagaaattatcaagcaagaggttagaattgttcaaaccgttgtatttcttcagccttat 
169 NLELTFGYEEIIKQEVRIVQTVVFLQPY 
37286 gtcgagtctaaagtagactttcctcctgtagttgaagagaatttgaaatatgtcactaggcaggaagattctcgaaacctgtgt 
197 VESKVDFPLVVEENLKYVTRQEDSRNLC 
37370 acggcttacaagttgacaggtaaaaaggaagaaggcagtcaagagcctttaacgtttgcttctatcaacaatggaagtgaatat 
225 TAYKLTGKKEEGSQEPLTFASIMNGSEY 
37454 ctcattgatgtttcgtggtttactacacgccacatgaagcctcgatatattgctaaatctaaaagcgacgaacattttagaatt 
253 LIDVSWFTTRHMKPRYIAKSKSDEHFRI 
37538 aaagaaaatttgatgagtgctgcgcgtgcttatcttgacatctacagtcgcccactaattggatatgaggcttcagcggtcctt 
281 KENLMSAARAYLDIYSRPLIGYEASAVL 
37622 tataacaaggttcctgacttgcatcatactcaactaattgtcgacgaccattatgatgttatcgagtggcgaaagatatctgct 
309 YNKVPDLHHTQLIVDDHYDVIEWRKISA 
37706 cgaaaaattgactacgacgacctttcaaactctactaccattttccaagaccctcgaaaagacttgatggacttgctaaatgag 
337 RKIDYDDLSNSTIIFQDPRKDLMDLLNE 
37790 gacggcgaaggagtcctttcaggggaaactgtaaatgagtcccaagttgttattagatacgcagatgacattttagggactaat 
365 DGEGVLSGETVNESQVVIRYADDILGTN 
37874 tttaatgcagaatctgggaaatacattggtgtccttaatactaataagaaaccgagcgaattagttcctgacgactttacatgg 
393 FNAESGKYIGVLNTNKKPSELVPDDFTW 
37958 attcgactagaaggtcctaaaggtgacgcaggtttaccgggagctcctgggcgtgatggagtcgacggtgtacctggaaagagc 
421 IRLEGPKGDAGLPGAPGRDGVDGVPGKS 
38042 ggagtagggatagcagatacagctatcacttatgctgtatccgtttccggaacgcaagagcctgaaaatggatggagcgaacaa 
449 GVGIADTAITYAVSVSGTQEPENGWSEQ 
38126 gttcctgaactcataaaaggtcgattcttgtggactaaaacattttggagatatactgacggctcacatgaaactggatactcc 
477 VPELIKGRFLWTKTFWRYTDGSHETGYS 
38210 gttgcctatatagggcaagacggaaattccggaaaagacggaatcgcaggtaaggacggagtaggtatagccgcaactgaagtc 
505 VAYIGQDGNSGKDGIAGKDGVGIAATEV 
38294 atgtatgcaagttcgccatctgctactgaagctccagctggtggatggtctacgcaagttcctaccgtcccaggtggtcagtat 
533 MYASSPSATEAPAGGWSTQVPTVPGGQY 
38378 ttatggactcgaacaagatggcgctacactgaccaaactgatgaaattggatattcagtttcaagaatgggcgagcagggtcct 
561 LWTRTRWRYTDQTDEIGYSVSRMGEQGP 
38462 aaaggtgacgcaggtcgtgacggtattgcaggaaagaacggaatagggttgaagtcaacttcagtttcttatggaattagtccc 
589 KGDAGRDGIAGKNGIGLKSTSVSYGISP 
38546 actgattctgcgattcctggagtatgggcttcacaagttccttctttaatcaaaggtcaatatctttggactcgaactatttgg 
617 TDSAIPGVWASQVPSLIKGQYLWTRTIW 
38630 acctataccgattcaactaccgaaacgggctatcaaaaaacctacattccaaaagacgggaatgacggtaaaaatggaattgct 
645 TYTDSTTETGYQKTYIPKDGNDGKNGIA 
38714 ggtaaggatggggtaggaattaagtctacgaccattacctacgcaggctcaacctcaggaacagttgcgcctacttcaaattgg 
673 GKDGVGIKSTTITYAGSTSGTVAPTSNW 
38798 acttctgctattccaaatgttcaaccgggattcttcttgtggacgaaaactgtttggaactatactgatgacaccagcgaaaca 
701 TSAIPNVQPGFFLWTKTVWNYTDDTSET 
38882 ggttactcagtttccaagataggcgaaacaggccctagaggagttcaaggtcttcaaggccctcaagggcttcaaggaattcct 
729 GYSVSKIGETGPRGVQGLQGPQGLQGIP 
38966 ggacctgcaggagctgacggacgttcgcaatatactcacctcgctttctctaatagtccaaacggtgagggatttagtcatact 
757 GPAGADGRSQYTHLAFSNSPNGEGFSHT 
39050 gacagcggacgagcatacgtcggtcagtatcaagatttcaatcccgtccattcaaaagaccctgcagcctatacatggacgaaa 
785 DSGRAYVGQYQDFNPVHSKDPAAYTWTK 
39134 tggaaggggaatgacggagctcaagggatacccgggaagccaggcgcagacggtaagactaattatttccatatagcttacgct 
813 WKGNDGAQGIPGKPGADGKTNYFHIAYA 
39218 tcaagtgcagacggatcacgtgagttcagtttggaagataataatcaacaatatacgggttattactccgattatgagcaagca 
841 SSADGSREFSLEDNNQQYMGYYSDYEQA 
39302 gatagcagggatcgaactaagtatcgatggtttgaccgccttgccaatgttcaagtgggaggtcgaaacgagttccttaattct 
869 DSRDRTKYRWFDRLANVQVGGRNEFLNS 
39386 ttatttgaatttggtttaaaacctcgctattctagttacaatctaatggacggacaagatcaaacgcaaggacagatatctgct 
897 LFEFGLKPRYSSYNLMDGQDQTQGQISA 
39470 actattgacgaacgtcaacggttcaaaggtgctaactctttacgacttgactcaacatggaacggtaaaccgcagaaccaaaaa 
925 TIDERQRFKGANSLRLDSTWNGKPQNQK 
39554 ctgaccttttctttaggaggagatacgcgattaggtactccaaccgagtggtctaatttagaaggtcgtatcagtttctgggct 
953 LTFSLGGDTRLGTPTEWSNLEGRISFWA 
39638 aaggcctctaggaacggagtgagcttagctgcacggccgggttatcgtagtaacgtatttaccgcaaccttaaccgatcaatgg 
981 KASRNGVSLAARPGYRSNVFTATLTDQW 
39722 aagttctacgattttaaattctttgacaaagttaattcaaattgtaccgctgaagcaattttccatgtattcactcaaagttgt 
1009 KFYDFKFFDKVNSNCTAEAIFHVFTQSC 
39806 tcagtgtggctcaatcatattaaaatcgaacttggtaatatctctactccttttagtgaagcagaggaagaccttaaatatcga 
1037 SVWLNHIKIELGNISTPFSEAEEDLKYR 
39890 attgactcaaaagccgatcaaaagctaactaaccaacagttgacggcactcacggaaaaggctcaactacatgacgcagaactg 
1065 IDSKADQKLTNQQLTALTEKAQLHDAEL 
39974 aaagctaaggctacaatggagcagttaagtaacttagaaaaggcttatgaaggtagaatgaaagctaatgaagaagctatcaaa 
1093 KAKATMEQLSNLEKAYEGRMKANEEAIK 
40058 aaatcggaagccgacctaatcttagcggcaagtcgaattgaagctactatccaagaacttggcgggctacgggaactgaagaag 
1121 KSEADLILAASRIEATIQELGGLRELKK 
40142 ttcgtcgacagttacatgagctcttctaatgaaggtctaattatcggtaagaacgacggtagctctaccattaaggtatcaagt 
1149 FVDSYMSSSNEGLIIGKNDGSSTIKVSS 
40226 gaccgaatttctatgttctccgcagggaatgaagttatgtaccttacgcaagggttcattcacatcgataacgggatctttacc 
1177 DRISMFSAGNEVMYLTQGFIHIDNGIFT 
40310 caatccattcaagtcggccgatttagaacggaacaatactcgtttaatccagacatgaacgtgattcggtatgtaggataa 40390 
1205 QSIQVGRFRTEQYSFNPDMNVIRYVG* 
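Each entry of Table 30 pairs 84-nucleotide lines with the 28 amino acids they encode, so a listed translation can be checked codon by codon. The sketch below is an illustration only, not part of the original disclosure. It assumes the standard genetic code (whose sense-codon assignments are the same as in the bacterial code), and `translate` is a hypothetical helper name.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order at each position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a nucleotide string codon by codon; '*' marks a stop codon.

    Whitespace is ignored, a trailing partial codon is dropped, and any codon
    containing an unreadable character is reported as 'X'.
    """
    seq = "".join(dna.lower().split())
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First line of the dp1ORF001 entry (positions 36698 onward).
first_line = (
    "atgattgacaataatttacctatgagtccaattcctggcgaaattgttcaa"
    "gtatatgaccaaaacttcaatctaattggagca"
)
print(translate(first_line))
```

Comparing the output with the amino acid line printed under each nucleotide line shows where a letter or base was misread; the comparison cannot by itself say which of the two lines carries the error.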
dp1ORF002 

32386 atggattttgggtcaattgcagcaaaaatgactttggatatctcaaacttcacaagtcaattaaatcttgctcaaagtcaagcg 
1 MDFGSIAAKMTLDISNFTSQLNLAQSQA 
32470 caacggctcgcactagagtcttcgaagtcctttcaaattggttctgctttaacaggattagggaaaggacttacgactgcggtt 
29 QRLALESSKSFQIGSALTGLGKGLTTAV 
32554 acccttcctcttatgggatttgcagccgcctctattaaagtagggaatgaattccaagctcaaatgtcccgtgttcaagctatt 
57 TLPLMGFAAASIKVGNEFQAQMSRVQAI 
32638 gcaggagcgacagcggaagagcttggtagaatgaagactcaagcaatcgaccttggtgctaaaactgcttttagtgcaaaagag 
85 AGATAEELGRMKTQAIDLGAKTAFSAKE 
32722 gcggctcaaggtatggaaaatctagcttcagccggtttccaggtaaatgaaatcatggacgctatgccaggggtacttgacctg 
113 AAQGMENLASAGFQVNEIMDAMPGVLDL 
32806 gctgccgtatctggaggagatgtggccgcgagctccgaggccatggctagttcacttcgagcctttggattagaggcaaaccag 
141 AAVSGGDVAASSEAMASSLRAFGLEANQ 
32890 gcgggtcacgtggctgacgtatttgctcgagcagcagctgatacgaacgcagaaactagcgacatggcagaggcgatgaaatac 
169 AGHVADVFARAAADTNAETSDMAEAMKY 
32974 gtcgcacccgttgctcactctatgggcttgagccttgaagaaacggctgcgtctattgggattatggccgacgccggtattaag 
197 VAPVAHSMGLSLEETAASIGIMADAGIK 
33058 ggctcgcaagccggaaccacgcttagaggcgctctctcgcgtattgccaaacctacgaaagcgatggtcaaatcaatgcaggaa 
225 GSQAGTTLRGALSRIAKPTKAMVKSMQE 
33142 ttaggagtttcgttctacgacgcgaacggaaacatgattccactaagagaacaaatcgctcaactgaaaacagctactgcagga 
253 LGVSFYDANGMMIPLREQIAQLKTATAG 
33226 ctaacacaagaggaacgaaatcgtcaccttgttaccttgtatggccaaaactcgttgtcaggtatgcttgcactattagacgca 
281 LTQEERNRHLVTLYGQNSLSGMLALLDA 
33310 ggtcctgagaaattggataagatgaccaatgctctcgtgaactcggacggagctgctaaggaaatggcagaaactatgcaggac 
309 GPEKLDKMTNALVNSDGAAKEMAETMQD 
33394 aaccttgctagtaaaatcgagcaaatgggaggagctttcgagtctgttgctattattgttcaacaaatccttgagcctgcactt 
337 NLASKIEQMGGAFESVAIIVQQILEPAL 
33478 gctaaaatcgtgggagcaatcacaaaagttctcgaagcattcgtaaatatgtcacctatcggtcaaaagatggttgtcatattc 
365 AKIVGAITKVLEAFVNMSPIGQKMVVIF 
33562 gcaggaatggttgcagcccttggaccactgcttctaattgcaggaatggtgatgacaactattgtcaagttaagaattgctatt 
393 AGMVAALGPLLLIAGMVMTTIVKLRIAI 
33646 cagtttttaggtccagcatttatgggaacgatgggaaccattgcaggagttatagcaatactctatgctctggtcgccgtgttc 
421 QFLGPAFMGTMGTIAGVIAIFYALVAVF 
33730 atgatagcctacacaaaatcggagagatttagaaactttatcaacagtcttgcgcctgctattaaagctgggtttggaggagcg 
449 MIAYTKSERFRNFINSLAPAIKAGFGGA 
33814 ttggaatggctacttccacgactgaaagagttaggagaatggttacagaaggcaggcgagaaggcgaaagagttcggtcagtct 
477 LEWLLPRLKELGEWLQKAGEKAKEFGQS 
33898 gtagggtctaaagtgtcaaaactgctcgaacagtttggaataagtatcggtcaggcaggaggctcgattggtcagttcattgga 
505 VGSKVSKLLEQFGISIGQAGGSIGQFIG 
33982 aatgttctcgaaaggctaggaggcgcatttggaaaagtaggaggagtcatttcaattgctgtttcacttgtaacaaaattcggt 
533 NVLERLGGAFGKVGGVISIAVSLVTKFG 
34066 ctcgcatttctagggattacaggaccactcgggattgctattagtctgttagtttcatttttgacagcttgggctagaacaggt 
561 LAFLGITGPLGIAISLLVSFLTAWARTG 
34150 gagttcaacgcagacggaattactcaagtattcgaaaacttgacaaacacaattcagtcgacggctgatttcatctctcaatac 
589 EFNADGITQVFEMLTNTIQSTADFISQY 
34234 cttccagtctttgtcgaaaaaggaactcaaattttagttaagattattgaaggaattgcatctgctgttcctcaagtagttgaa 
617 LPVFVEKGTQILVKIIEGIASAVPQVVE 
34318 gtgatttcacaagtcattgaaaatattgtgatgacaatttcgacagttatgcctcaattagtcgaagcaggaattaagatactc 
645 VISQVIENIVMTISTVMPQLVEAGIKIL 
34402 gaagcgcttataaatggtcttgttcaatctcttcctactatcattcaagcagctgttcaaattatcactgctttattcaatggt 
673 EALINGLVQSLPTIIQAAVQIITALFNG 
34486 cttgttcaggcacttcctacgcttattcaagcaggtcttcaaattttgtcagctctcataaacggactagttcaagcgcttccg 
701 LVQALPTLIQAGLQILSALINGLVQALP 
34570 gcaattattcaagcagctgttcaaattatcatgtcgcttgttcaagcactaattgaaaacttgcctatgataatcgaagcagcg 
729 AIIQAAVQIIMSLVQALIENLPMIIEAA 
34654 atgcagattataatgggtctagtcaacgcactgattgaaaatataggacctatcttagaagcagggattcaaattctaatggct 
757 MQIIMGLVNALIENIGPILEAGIQILMA 
34738 ttaatcgagggacttattcaagtgcttcctgaactaattacagcagcgattcaaatcattacttcactattagaagcaatcttg 
785 LIEGLIQVLPELITAAIQIITSLLEAIL 
34822 tcgaaccttcctcaacttctagaagccggagttaaattgcttttatcacttcttcaagggttgctaaatatgcttcctcaacta 
813 SNLPQLLEAGVKLLLSLLQGLLNMLPQL 
34906 attgcaggggctttgcaaatcatgatggcacttcttaaagcagttatcgacttcgtccctaaacttcttcaagcaggtgttcaa 
841 IAGALQIMMALLKAVIDFVPKLLQAGVQ 
34990 cttcttaaggcattgattcaaggtattgcttcacttctcggctcacttttatcgacagctggaaacatgctttcatcattagtt 
869 LLKALIQGIASLLGSLLSTAGNMLSSLV 
35074 agcaagattgctagctttgtgggacagatggtttcaggaggtgcgaacctgattcgaaacttcattagtggtattgggtcaatg 
897 SKIASFVGQMVSGGANLIRNFISGIGSM 
35158 attggttcagctgtctctaaaattggcagcatgggaacttcaattgtttctaaggttactggattcgctggacaaatggtaagc 
925 IGSAVSKIGSMGTSIVSKVTGFAGQMVS 
35242 gcaggggtcaaccttgttcgaggatttatcaatggtatcagttccatggtaagttctgcggtaagtgcggcggctaatatggct 
953 AGVNLVRGFINGISSMVSSAVSAAANMA 
35326 agcagtgcattaaatgccgttaagggattcttaggtattcactctccttcacgtgtcatggagcagatgggtatctatacgggt 
981 SSALNAVKGFLGIHSPSRVMEQMGIYTG 
35410 caagggttcgtaaatggtattggtaacatgattcgaactacacgtgacaaggctaaagaaatggctgaaactgttactgaagct 
1009 QGFVMGIGNMIRTTRDKAKEMAETVTEA 
35494 ctcagcgacgtgaagatggatattcaagaaaatggagttatagaaaaggttaaatcagtttacgaaaagatggctgaccaactt 
1037 LSDVKMDIQENGVIEKVKSVYEKMADQL 
35578 cctgaaactcttccagctcctgatttcgaagatgttcgtaaagcagccggttcgcctcgagtggacttgttcaatacaggaagt 
1065 PETLPAPDFEDVRKAAGSPRVDLFNTGS 
35662 gacaaccctaaccaacctcagtcacaatctaaaaacaatcaaggcgagcaaaccgttgtcaacattggaacaatcgtagttcga 
1093 DMPNQPQSQSKNNQGEQTVVNIGTIVVR 
35746 aacaatgacgacgttgacaaactgtcgagaggattgtataatagaagtaaagaaactctatcagggtttggtaacattgtaaca 
1121 NMDDVDKLSRGLYNRSKETLSGFGNIVT 
35830 ccgtaa 35835 
1149 P* 
dplORF003 

53 538 atggcacaaaaaggactctttggtgcaaagcctcgttctagcaagaagaacgatgctcagttacttgctcaacggaaaaacagg 

1 MAQK GLFGAKPRSS KKNDAQLLAQRKNR 

53622 aagcctgcagttgaggttacttacatttcaggaaacgctctaaaggacgcagttgctagagctcgtactctttcaactaggatt 

29 KPAVEVTYISGNALKDAVARARTLSTRI 

53 706 cttggacacgttcttgatagacttgagttaatcactgaggaagcaaaactcgagcagtatgtagacaaaatgattgaagacgga 

57 LGHVLDRLELITEEAKLEQYVDKMIEDG 

53790 ataggttctattgacgtagaaactgatggactcgatactattcacgatgagctggcaggagtctgcttgtactcacctagtcaa 

85 IGSIDVETDGLDTIHDELAGVCLYSPSQ 

53 874 aaaggaatctatgctcctgtcaatcatgttagcaatatgacgaagatgcgaattaagaatcaaatttctcctgagttcatgaag 

113 KG IYA PVNHVSNMTKMRI KNQI S PEF MK 

53 958 aaaatgcttcaacggattgtagattcaggaattcctgtcatctatcataattcgaaatttgacatgaaatcgatttattggcga 
141 KMLQRIVDSGIPVIYHNSKFDMKSIYWR 

54 042 ctcggcgtcaaaatgaatgagccagcgtgggatacatatttagccgcaatgcttttaaatgaaaacgagtctcacagcttgaaa 
169 LGVKMNE PAWDTYLAAMLLNENESHSLK 
54126 agtcttcactctaaatatgttaggaacgaagaaaacgcagaggttgcaaaatttaatgacttatttaaaggaattccttttagt 
197 SLHSK .Y VRNEEMAEVAKFNDLFKGIPFS 
54210 ttaattcctcctgatgttgcctatatgtatgcggcctatgaccctttgcaaactttcgaactctatgaatttcaagaacaatac 
225 LI PPDVAY MYAAYDP LQTFELY .EFQEQY 
54294 ttgactccaggaactgaacaatgtgaagaatataacctggaaaaagtctcatgggttcttcataatattgagatgcctctaatt 
253 LT PGTEQCEEYNLE KVSWVLHN I EMPLI 
54378 aaagttctcttcgacatggaagtctacggtgtcgacttagaccaagataagctggcagaaattagagaacagtttactgccaat 
281 KVLFDMEVYGVDLDQDKLAE I REQFTAN 
54462 atgaacgaggctgagcaagagtttcaacagcttgtcagcgaatggcagcctgaaattgaagaacttcgacaaactaatttccag 
309 MNEAEQE FQQLVSEW QPE I EELRQTNFQ 
54 546 agctatcaaaaactcgaaatggatgcaagaggtcgagtgacggtaagcatttccagtcctactcaattagcaattctgttttat 
337 S YQKL EMDARGRVTV S I S S PTQLAI L. FY 
54630 gatatcatgggattgaaaagtcctgaaagggataaacctagaggaacaggcgaaagtattgtcgagcattttgataacgatatc 
365 DIMGLKS PERDKPRGTGESIVEHFDNDI 
54714 tcaaaagcacttttgaaatatagaaaatatgcaaaattagtttcgacctatacaacacttgaccaacaccttgcaaagcctgac 
393 SKALLKYRKYAKLVSTYTTLDQHLAKPD 
54798 aatcgaattcacactacattcaaacagtacggagctaagacagggcgtatgtcaagtgagaatcctaacttacagaatattcct 
421 NRIHTTFKQYGAKTGRMSSENPNLQNIP 
54 882 tctcgcggtgagggtgcagtagttcgacaaatctttgcagccagtgaagggcattacattattggtagtgactactctcaacaa 
449 SRGEGAVVRQI FA AS EGHYI IGSDYSQQ 
54 966 gaacctcgttcattggcggaattaagtggcgacgaaagtatgcgacatgcttacgaacaaaacctggacctatattcagttatc 
477 E P RSLAE LSGDE SMRHAYEQNLDLYSVI 
55050 ggttcgaaactttatggtgttccctatgaagagtgtttagagttctatcccgacggaacgactaacaaggaaggaaaacttcga 
505 GSKLYGVPYEECLEFYPDGTTNKEGKLR 
55134 agaaattctgtcaagtccgttcttttaggtcttatgtacggccgcggggctaactcaatcgctgagcagatgaatgtatctgtc 
533 RNSVKSVLLGLMYGRGANS IAEQMNVSV 
55218 aaagaagcgaataaggttattgaagatttcttcaccgagttccctaaagtggcagactatatcatattcgttcaacagcaggcg 
561 KEAMKVI EDFFTEFPKVADYI IFVQQQA 
55302 caggacttgggatatgttcaaacagctaccggtcgaagaagaaggcttcctgatatgagtcttcctgaatacgagttcgagtat 
589 qdlgyvqtatgrrrrlpdmslpe'yefey 
55386 atcgacgctagcaagaacgaagatttcgacccctttaactttgacgcagaccaacagatggacgatactgttcctgaacatatt 
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617 
55470 
645 
55554 
673 
55638 
701 
55722 
729 
5S806 
7S7 

dplORF004 

40401 

1 

40485 
29 

40S69 
57 

40653 
85 

40737 

113 

40821 

141 

40905 

169 

40989 

197 

41073 

225 

41157 

253 

41241 

281 

41325 

309 

41409 

337 

41493 

365 

41577 

393 

41661 

421 

41745 

449 

41829 

477 

41913 

505 

41997 

S33 

42081 

561 

42165 

589 

42249 

617 

42333 

64S 

42417 

673 

dplORPOOS 

23674 

1 

23758 
29 

23842 
57 

23926 
85 

24010 

113 

24094 

141 

24178 



IDASKNEDFDPFNFDADQQMDDTVPEHI 
atcgaaaaatattgggcccagccagatagagcctggggatttaagaagaagcaagaaattaaagaccaggcaaaagccgaagga 

I EKYWAQLDRAWGFKKKQE I KDQAKAEG 
attcttattaaggataacggaggcaagacagctgatgctcagcgccaatgtttgaactcagttattcaaggaacggcagccgac 
I L I K D NGGKIADAQRQCLNSVIQGTAAD 
atgactaagtacgcaatgattaaggtacacaatgacgctgaattgaaagaattaggattccatttaatgattccagttcacgat 
MTKYAMIKVHNOAELKELGFHLMIPVHD 
gagttactaggtgaggttcctatcaagaacgcaaaacggggagcagaaaggttgacagaagttatgattgaagcagccaaggac 
E LLGEVP IKNAKRGAERLTEVMI EAAKD 
attattagtcttccaatgaaatgtgaccccagtatagtagaaagatggtatggtgaagaaattgaaatctaa 55877 
I ISLPMKCDPSIVERWYGEEIEI* 

atgacaaaatttatcaactcatacggccctcttcacttgaacctttacgtcgaacaagttagtcaggacgtaacgaacaactcc 

MTKFINSYGPLHLNLYVEQVSQDVTNNS 

tcgcgagttagttggcgagctactgtcgaccgcgatggagcttatcgaacgtggacttatggaaatattagtaacctttccgta 

S R VSWRATVDRDGAYRTWTYGNISNLSV 

tggttaaatggttcaagtgttcatagcagtcacccagactacgacacgtccggcgaagaggtaacgctcgcaagtggagaagtg 

WLNGSSVHSSHPDYDTSGEEVTLASGEV 

actgttcctcacaatagtgacgggacaaagacaatgtccgtttgggcttcgtttgaccctaataacggcgttcacggaaatatc 

TVPHNSDGTKTMSVWASFDPNNGVHGNI 

actatctctactaattacactttagacagtattccaaggtctacacagatttctagttttgagggaaatcgaaatctaggatct 

T I STNYTLDS I PRSTQISSFEGNRNLGS 

ttacatacggttatctttaaccgaaaagtgaactcttttacgcatcaagtttggtaccgagttttcggtagcgactggatagat 

L HT V I FNRKVN S F T HQVWYRVFGS DW I D 

ttaqgtaagaaccatactactagcgtatcctttacgccgtcactggacttagcaaggtacttacctaaatcaagttccggaaca 

LGKNHTTSVSFTPSLDLARYLPKSSSGT 

atqqacatctgtattcgaacctataacggaactacgcaaattggtagtgacgtctattcaaacggatggaggttcaacatcccc 

M DICIRTYNGTTQIGSDVYSNGWRFNIP 

gattcagtacgtcctactttttcgggcatttctttagtagacacgacttcagcggttcgacagattttaacagggaacaacttc 

D SVRPTFSGISLVDTTSAVRQILTGMNF 

ctccaaatcatgtcgaacattcaagtcaacttcaacaatgcttccggcgcttacggatccactatccaagcatttcacgctgag 

LQIMSNIQVNFNNASGAYGST IQAFHAE 

ctcgtaggtaaaaaccaagctatcaacgaaaacggcggcaaattgggtatgatgaactttaatggctccgctaccgtaagagca 

L VG KNQA IN ENGGKLGMMN FMGSATV RA 
tqqqttacagacacgcgaggaaaacaatcgaacgtccaagacgtatctatcaatgttatagaatactatggaccgtctatcaat 

W V T D T R GKQSNVQDVSINVI EYYGPS IN 
ttctccgttcaacgtactcgtcaaaatcctgcaattatccaagctcttcgaaatgctaaggtcgcacctataacggtaggaggt 

F SVQRTRQN PAI IQALR NAKVAP IT VGG 
caacagaaaaacatcatgcaaattaccttctccgtggcgccgttgaacactactaatctcacagaagatagaggttcggcgtca 
QQKNIMQITFSVA PLNTTNFTEDRGSAS 
qqqacgttcactactatttccctaatgactaactcgtccgcgaacttagctggtaactacgggccggacaagtcttacatagtt 
G T FTTISLMTNSSANLAGNYG PDKSY IV 
aaggctaaaatccaagacaggttcacttcgactgaatttagtgctacggtagctaccgaatcagtagtccttaactatgacaag 
K AKIQDRFTSTEFSATVATESVVLN YDK 
gacggtcgacttggagttggtaaggttgtagaacaagggaaggcagggtcaattgatgcagcaggtgatatacatgctggaggt 

DGR LGV GKVVEQGKAGS IDAAGD IYAGG 
cgacaagttcaacagtttcagctcactgataataatggagcattgaacaggggtcaatataacgatgtttggaataagcgtgaa 
R QVQQFQLTDNNGALNRGQ YNDVWNKRE 
acagagtttacatggcgaagtaacaaatacgaggacaaccctacgggaactcgaggtgaatggggactatttcaaaatttctgg 
T E FTW RSNKYE DNPTGTRGEWGLF QNFW 
ttagatagctggaaaatggttcaatccttcattacaatgtcaggaagaatgttcatcaggacagcgaacgatggaaacagctgg 
LDSWKMVQSFITMS GRMFI RTANDGNSW 
agacctaacaagtggaaagaggttctatttaagcaagacttcgaacagaataattggcagaaacttgttcttcaaagtgggtgg 

R P N K W K EVL F KQ D F E QNNWQKLV LQS GW 
aaccatcactcaacctatggcgacgcattctattcgaaaactcttgacggcatagtatatttgagaggaaatgtgcataaagga 

N HHSTYGDAFYS KTLDG IVYLRGNVHKG 

cttatcgacaaagaggctactattgcagtacttcctgaaggatttagaccgaaagtttcaatgtatcttcaggctctcaataac 

LIDKEATIAVLPEGFRPKVSMYLQALNN 

tcatatggaaatgccattctatgtatatacactgacggaagacttgtggtgaaatcgaatgtagataattcttggttaaat^ta 

SYGNAILCIYTDGRLVVKSNVDNSWLNL 

gacaatgtctcatttcgtatttaa 42440 
DNVSFRI * 

atggctaaaaaatcaaaagctatctcacacacagacgaactgattagtcagtcgtttgacagccccttggcaaagaatcaaaag 

MAKKS KAISHTDELISQSFDS PLAKNQ K 
ttcaagaaagagcttcaggaagttgaaaagtattatcaatacttcgacggatttgatgtcacggacttgaatactgactatggg 
FK KELQEVEKYYQYFDGFDVTDLNTDYG 
caaacatggaagattgacgaagactcagtcgactataaacctactcgagaaattcgaaactatattcgacaacttatcaaaaag 
QTWKI DEDSVDYKPTREIRNYIRQLIKK 
caatcacgctttatgatgggtaaagagccagagcttatctttagtccagttcaagacaatcaagatgaacaggccgagaacaag 
QSRFMMGKEPELIFSPVQDNQDEQAEN K 



cacagtaggtaag 



cgtatcctattcgactctattttaaggaattgtaaattctggagcaaaagtacaaatgcattagtcgacgc 
R ILFDSILRNCKFWSKSTNALVDATVGK 
cgggtattgatgacagtagtagcaaatgccgctcaacaaattgacgtccagttttattcaatgcctcagctcacctatacagtt 
R v LMTVVANAAQQIDVQ FYSMPQFTYTV 

agcttgctttctgttgacattgtttatcaggacgagcgtacaaaaggaatgagcactgaaaaacaa 



gaccctagaaacccttcc 
169 DPRNPSSLLSVDIVYQDERTKGMSTEKQ 

24262 ctttggcatcattatagatatgaaatgaaagctggaacaagtcaatcaggaattgcaacagctttagaagacattgaagaacaa 

197 LWHHYR YEMKAGTSQSGIATALEDIEEQ 

24346 tgttggctcacttatgccttaacggatggagagtcgaaccaaatctatatgacagaaagtggccaaactactatcaaggagaca 

225 CWLTYALTDGESNQIYM TESGQTT I KET 

24430 gaggctaaacttgtagaaattgaagacaacctaggaaacaagattgaagttcctttaaaagttcaagaatccgccccaaccggc 

253 EAKLVEIEDNLGNKIEVPLKVQESAPTG 

24514 ttgaagcaaattccttgtcgagttattcttaatgaaccattgactaatgacatatacgggacaagcgatgtcaaagaccttatc 

281 LKQI PCRVILNEPLTNDIYGTSDVKDLI. 

24598 acagtagcagataacttgaacaaaactattagtgacttacgagattcacttcgatttaaaatgttcgagcagcctgttatcatt 

309 TVADNLNKTISDLRDSLRFKMFEQPVII 

24682 gatggctcttctaagtcaattcaaggaatgaagattgcgccaaacgctttggtcgaccttaagagtgaccctacttcctcaatc 

337 DGSSKSIQGMKIAPNALVDLKSDPTSSI 

24766 ggcggtactggaggcaagcaagctcaagtcacttccatttcaggaaacttcaacttccttccagcggctgaatattatttagag 

365 GGTGGKQAQVTS I SGNFN F L PAAEYYLE 

24850 ggcgctaagaaagccatgtatgaactaatggaccagccaatgcctgaaaaggtacaggaggcgccatcaggaattgcaatgcag 

393 GAKKAMYELMDQPMPE KVQEA PSG I A M Q 

24934 ttcttattccacgacctaatttctcgatgtgacggaaaatggattgagtgggatgatgctattcaatggctcattcaaatgctg 

421 FL.FYDLI SRCDGKWI EWDDA I QWL I QML 

25018 gaagaaattttagcaacagtgaatgttgacttgggaaatattcctcaagatattcaatcaagttatcaaacacttacgacaatg 

449 E E I LATVNVDLGNI P QD I QS S Y.QTLTTM 

25102 actatcgaacaccactatccaattcctagcgatgaactttctgctaagcaacttgcgctcactgaagttcaaactaatgtacgc 

477 TIEHHYPI PSDELSAKQLALTEVQTN VR 

25186 agccaccaatcttacattgaagaattcagtaagaaggaaaaggcggacaaggaatgggaacgcattttggaagaacttgctcag 

505 SHQ SYIEEFSKKEKADKEWERI LEELAQ 

25270 cttgacgaaatctcagctggagcattgcctgtattagcaaacgaattaaacgaacaagaggagcctcaagatgaaacgagtgaa 

533 LDE I SAGALPVLANELNEQE E PQDETS E 

25354 gaagacgaagttgatgacaaagaaaaagaacaaactgaacaaccaaccgaagaaggagtcgacccagacgttcaaggttaa 
25434 

561 EDEVDDKEKEQTEQ PTEEGVD PDVQG * 
dplORF006 

45296 atgattgaaatcgttatagcacgttcgaaagctaggcgaggtcgaaccctatttattgaaacatgggcaagcactgatgaagat 

1 M I E I V IARS KARRGRT L F.I E T WA S T D E D 

45380 gcagttaaaatggcagaaaagatttccagcttgcccaatgtagtcgagacgtcttctaataacttcgaactaccttataagtat 

29 AVKMAEKISSLPNVVETSSNNFELPYKY 

45464 ttcaataatgttatagacgctctagatgaatgggagcttcacatcttcggcgaacttgataaagatgttcaagactacattgac 

57 FNNVIDALDEWELHIFGELDKDVQDYID 

45548 tctcgaaaccgaatagcttcttcaagcaatgagcagttttcgttcaagactactccattcgcgcaccaggttgaatgtttcgaa 

85 SRNRIASSSNEQFS FK.TT P FAHQVECFE 

45632 tacgcacaagagcatccatgtttccttttaggcgatgagcaaggtttagggaaaactaaacaggcaattgatattgcagttagc 

113 YAQEHPCFLLGDEQGLGKTKQAIDIAVS 

45716 aggaaggcaagtttcaaacattgtttaatcgtatgttgcatatcagggctcaaatggaattgggcaaaagaagtaggtattcat 

141 RKAS FK HCLI VCC I SGL K WNWAKEVG IH 

45800 tcaaatgagtcagctcatattttaggaagtcgagtcactaaagatgggaaattagtgattgacggagtttctaaacgggcagaa 

169 SNESAHI LGSRVTKDGKLVI DGVSKRA E 

45884 gacttgcttggtggccacgacgaattcttccttatcactaacattgaaactcttcgcgatgctgtgttcattaaatacttaaat 

197 DLLGGHDEFFLITNIETLRDAVFI KYLN 

45968 gaactgacaaaaagcggagaaattggaatggttactattgacgagattcacaagtgtaagaacccttcaagtaagcaaggggct 

225 ELTKSGEIGMVIIDEIHKCKN PSS KQGA 

46052 tcaattcaaaagctccaaagttattacaagatgggacttacaggaactcctctaatgaataacccaatcgatgtattcaatgtt 

253 S I QKLQS YYKMGLTGT P L M N N P I D -V FNV 

46136 atgaagtggctaggggcggaacatcatacactgactcagttcaaagagcgatactgtatcgtcgaccagttcaatcaaatcact 

281 M KWLGAE H H T LTQ F KE R Y C I V D Q F NQ I T 

46220 ggatatcgaaatctagctgaacttcgcgagcttgtcaacgactacatgcttagaagaacgaaggaagaagttttagacctgcct 

309 GYRNLAELRELVNDYMLRRTKEEVLDLP 

46304 gaaaagattcgagtcacagagtatgtcgacatgaactcgaaacagtcaaaaatctataaggaagttttgactaaacttgttcaa 

337 EKIRVTEYVDMNSKQSKI YKE VLTKLV Q 

46388 gaaatagataaagtcaagctcatgcctaaccctctagccgaaacgattcgacttcgacaagcgactggaaatccttcgatttta 

365 EIDK VKLMPNPLAETI RLRQATGNPSIL 

46472 actactcaagatgtcaagtcttgcaagttcgaaagatgtatcgaaattgtcgaggaatgtatccagcaaggaaagtcctgcgtg 

393 TTQDV KSCKFER CI EIVEEC I QQG KSCV 

46556 atatttagcaattgggaaaaggttattgaacctcttgctaagatactttcgaagacagtcaaatgcaacctggtaacaggagaa 

421 I FSNWEKVIEPLAKILSK. TVKCMLVTGE 

46640 accgcagataagttcaacgaaattgaagaatttatgaatcacagaaaggcttctgttattttaggaactataggtgcgctagga 

449 TADKFNE I EEFMNHRKASV I LGTIGALG 

46724 acaggatttactttgacgaaagcggatacggttattttcttagatagtccgtggacacgcgcagaaaaggaccaagccgaagat 

477 TGFTLTKADTVIFLDSPWTRAEKDQAED 

46808 aggtgtcatagaattggcgcaaaaagttctgtcactatctacacgcttgtcgccaaaggtactgttgacgaacgtatagaagac 

505 RC HRIGAKSSVTIYTLVAKGTVDERIED 

46892 cttattgaacggaaaggagaattagcagattataccgtagacggtaagcctatgaaatctaaaattggtaaccttttcgatatc 

533 LIERKGELADYIVDGKPMKSKIGNLFDI 

46976 ctgcttaaatag 46987 

561 L L K * 
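
The paired lines in this listing can be cross-checked mechanically. The following is a minimal sketch, not part of the original disclosure, assuming that each amino-acid line is the standard-genetic-code translation of the nucleotide line above it and that a trailing number is the 1-based coordinate of the last nucleotide on that line. The function names translate and end_coordinate are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (assumption: the listing uses the standard code; '*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str) -> str:
    """Translate a nucleotide string in frame 1; unreadable codons become 'X'."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))


def end_coordinate(start: int, nt: str) -> int:
    """Coordinate of the last nucleotide on a line that begins at `start`."""
    return start + len(nt) - 1


# Final line of the ORF above: starts at 46976 and is printed with end 46987.
last_line = "ctgcttaaatag"
print(translate(last_line))              # LLK*
print(end_coordinate(46976, last_line))  # 46987
```

A full 84-nucleotide line translates to 28 residues, which is why the nucleotide coordinates advance by 84 and the residue numbers by 28 from one line pair to the next. A codon containing a character other than a, c, g, or t is reported as X rather than guessed.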
dplORF007 

22230 atgacaataagcctgagaaataaactacctaagttcaacttcgtcccttttagtaagaaacaactccagctcctaacatggtgg 
169 dprnPSSLLSVDIVYQDERTKGMSTEKQ 

24262 ctctggcatcattatagatatgaaatgaaagctggaacaagtcaatcaggaattgcaacagctttagaagacattgaagaacaa 

197 LW H HYRYEMKAGTSQSGIATAL EDIEEQ 

24346 tgttggctcacttatgccttaacggatggagagtcgaaccaaatctatatgacagaaagtggccaaactactatcaaggagaca 

225 CWLTYAtiTDGESNQIYMTESGQTT I KET 

24430 gaggctaaacttgtagaaattgaagacaacctaggaaacaagattgaagttcctttaaaagttcaagaatccgccccaaccggc 

253 EAKLVEIEDNLGNKIEVPLKVQESAPTG 

24514 ttgaagcaaattccttgtcgagttattcttaatgaaccattgactaatgacatatacgggacaagcgatgtcaaagaccttatc 

281 LKQI PCRVILMEPLTNDIYGTSDVKDLI, 

24598 acagtagcagataacttgaacaaaactattagtgacttacgagattcacttcgatttaaaatgttcgagcagcctgttatcatt 

309 TVADNLNKT I SDLRDS LRF K M F EQ PV I I 

24682 gatggctcttctaagtcaattcaaggaatgaagattgcgccaaacgctttggtcgaccttaagagtgaccctacttcctcaatc 

337 DGSSKSIQGMKIAPNALVD LKSDPTSSI 

24766 ggcggtactggaggcaagcaagctcaagtcacttccatttcaggaaacttcaacttccttccagcggctgaatattatttagag 

365 GGT GGKQAQVTSISGNFMFLPAAEYYLE 

24850 ggcgctaagaaagccatgtatgaactaatggaccagccaatgcctgaaaaggtacaggaggcgccatcaggaattgcaatgcag 

393 QAKKAMYELMDQPMPEKVQEAPSGIAMQ 

24934 ttcttattctacgacctaatttctcgatgtgacggaaaatggattgagtgggatgatgctattcaatggctcattcaaatgctg 

421 FLFYDLISRCDGKWIEWDDAIQWL IQML 

25018 gaagaaattttagcaacagtgaatgttgacttgggaaatattcctcaagatattcaatcaagttatcaaacacttacgacaatg 

449 EE ILATVNVDLGNI PQDIQSSYQTLTTM 

25102 actatcgaacaccactatccaattcctagcgatgaactttctgctaagcaacttgcgctcactgaagttcaaactaatgtacgc 

477 TIEHHYPI PSDELSAKQLALTEVQTNVR 

25186 agccaccaatctcacattgaagaattcagtaagaaggaaaaggcggacaaggaatgggaacgcattttggaagaacttgctcag 

505 SHQSYIEEFSKKE KADKEWERI LEELAQ 

25270 cttgacgaaatctcagctggagcattgcctgtattagcaaacgaattaaacgaacaagaggagcctcaagatgaaacgagtgaa 

533 LDEI SAGALPVLANELNEQEEPQDETSE 

25354 gaagacgaagttgatgacaaagaaaaagaacaaactgaacaaccaaccgaagaaggagtcgacccagacgttcaaggttaa 

25434 

561 EDE VDDKE KEQT EQPTEEGVDPDVQG* 
dplORF006 . 

45296 atgattgaaatcgttatagcacgttcgaaagctaggcgaggtcgaaccctatttattgaaacatgggcaagcactgatgaagat 

1 M I E I V I A RS KARRGRT L F I E TWA S T D E D 

45380 gcagctaaaatggcagaaaagatttccagcttgcccaatgtagtcgagacgtcttctaataacttcgaactaccttataagtat 

29 A VKMAEKISSLPNVVETSSNNFELPYKY 

45464 ttcaataatgttatagacgctctagatgaatgggagcttcacatcttcggcgaacttgataaagatgttcaagactacattgac 

57 fnnvIDALDEWELHIFGELDKDVQDYID 

45548 tctcgaaaccgaatagcttcttcaagcaatgagcagttttcgttcaagactactccattcgcgcaccaggttgaatgtttcgaa 

85 s rnriasssneqfsfkttp fahqvecfe 

45632 tacgcacaagagcatccatgtttccttttaggcgatgagcaaggtttagggaaaactaaacaggcaattgatattgcagttagc 

113 YAQEHPCFLLGDEQGLGKTKQ AIDIAVS 

45716 aggaaggcaagttccaaacattgtttaatcgtatgttgcatatcagggctcaaatggaattgggcaaaagaagtaggtattcat 

141 RKAS FKHCLIVCC ISGLKWNWAKEVG IH 

45800 tcaaatgagtcagctcatattttaggaagtcgagtcactaaagatgggaaattagtgattgacggagtttctaaacgggcagaa 

169 SNESAHILGSRVTKDGFCLVIDGVS KRAE 

45884 gacttgcttggtggccacgacgaattcttccttatcactaacattgaaactcttcgcgatgctgtgttcattaaatacttaaat 

197 DLLGGHDEFFLI TNIETLRDAVFI KYLN 

45968 gaactgacaaaaagcggagaaattggaatggttattattgacgagattcacaagtgtaagaacccttcaagtaagcaaggggct 

225 E'LTKSGEIGMVI IDE IHKCK NPS S KQGA 

46052 tcaattcaaaagctccaaagttattacaagatgggacttacaggaactcctctaatgaataacccaatcgatgtattcaatgtt 

253 S IQKLQSYYKMGLTGTPLM NNPIDVFNV 

46136 atgaagtggctaggggcggaacatcatacactgactcagttcaaagagcgatactgtatcgtcgaccagttcaatcaaatcact 

281 M K W L GAEHHTLTQF KERYCIVDQFNQIT 

46220 ggatatcgaaatctagctgaacttcgcgagcttgtcaacgactacatgcttagaagaacgaaggaagaagttttagacctgcct 

309 GYRMLAELRELVNDYMLRRTK .EEVLDLP 

46304 gaaaagattcgagtcacagagtatgtcgacatgaactcgaaacagtcaaaaatctataaggaagttttgactaaacttgttcaa 

337 ekirvTEYVDMNSKQSKIYKEVLTKLVQ 

46388 gaaatagataaagtcaagctcatgcctaaccctctagccgaaacgattcgacttcgacaagcgactggaaatccttcgatt^ta 

365 EIDKVKLMPNPLAETIRLRQATGNPSIL 

46472 actactcaagatgtcaagtcttgcaagttcgaaagatgtatcgaaattgtcgaggaatgtatccagcaaggaaagtcctgcgtg 

393 ttqdVKSC KFERCIEIVEECIQQGKSCV 

46556 atatttagcaattgggaaaaggttattgaacctcttgctaagatactttcgaagacagtcaaacgcaacctggcaacaggagaa 
421 IFS NWEKVIEPLAKILSKTVKCNLVTGE 

46640 accgcagataagttcaacgaaattgaagaatttatgaatcacagaaaggcttctgtcattttaggaactataggtgcgctagga 

449 TADKFNEI EEFMNHRKASVI LGT I G A L G 
46724 acaggatttactttgacgaaagcggatacggttattttcttagatagtccgtggacacgcgcagaaaaggaccaagccgaagat 

477 TGFTLTKADTVIFLDS PWTRAEKDQAED 
46808 aggtgtcatagaattggcgcaaaaagttctgtcactatctacacgcttgtcgccaaaggtactgttgacgaacgtatagaagac 
505 RCHRIGAKSSVTIYTLVAKGTVDBRI B D 

46892 cttattgaacggaaaggagaattagcagattacatcgtagatggtaagcctatgaaatctaaaattggtaaccttttcgatacc 
533 LI ERKGELADYIVDGKPMKSKIGNLFDI 

46976 ctgcttaaatag 46987 

561 L L K * . 

dplORF007 atgacaataagcctgagaaataaactacctaagttcaacttcg 
1 MTiSLRKK LPKPKPVPPSKKQtQLLTWH 

22314 acaaagggctcacctctccgaactttcgacaccgtcatagcagacggttccattcgttcaggaaaaacagcaccgacggccctt 

29 TKGS PFRTFDIVIADGS I RS GKTVSMAL 

22398 ccattctcccettgggccatgacggaattcaacggacaaaactctgccatctgtggtaagacaatccacccagctcgacgaaac 

57 SF SLWAMTEFNGQNFAICG.KTIHSAR 

22482 gttatccagcctctaaagcaaatgctcacaagtcgcgggtatgaaattcgagacgttcgaaacgaaaacctacttatcaccaga 

85 viQPLKQML TSRGYEIRDVRMENLLIIR 

22566 caccttagaaatggcgaagaaattgccaactacttctatatatttggaggaaaagatgagtcgagccaagaccttacacagggg 

113 HFRNGEEIVNYFYI PGGKDE SSQ DLI QG 

22650 gtaacattagcaggtatcctctgcgatgaggcggcactgatgcctgaatcgttcgtcaaccaagcgacagggcgctgttccgta 

141 VTLAGIFCDEVALMPESFVNQATGRCSV 

22734 acaggtccgaaaatgtggttctcttgtaacccggccaatcctaatcactacctcaagaagaactggattgacaaacaggtcgaa 

169 TG SKMWFSCMPANPNHYFKKMWIDKQ V E 

22818 aagcgcatcttataccttcactttacaatggacgacaaccctagcttgacggatagcattaaaaggcgctatgagaaaatgcac 

197 K R ILY LHFTMDDNPSLTDS I KRRYEKMY 

22902 gctggagtcttcaggaaaagatttattctcggcctttgggtaacagcagatggtctagtttatccaacgttcaatgaagagcag 

225 AGVFRKRFILGLWVT ADG LVYSMFN E E Q 

22986 catgtcaaaaagctcaatatagaattcgaccgtttatccgtagcaggcgactttggtatctataatgcaacaaccttcggcctc 

253 hvkkLNIEFD RLFVAGDFGIYNATTFG L 

23070 catggattctcgaaacgtcataagcgctaccatctaactgagtcacactaccactcagggcgcgaggcggaagagcaactaacc 

281 Y G F S K RHKRYHLIES YYHSGREAEEQ L T 

23154 gaggcggatgttaattcgaatattcaatttagtccagtcctacaaaagactaccaaagagcacgcaaatgatctagtcgacatg 

309 I advns niqfss vlqkttkeyandlvdm 

23238 atacgaggaaagcaaatcgaacatataattctcgacccgtctgcccctgctatgactgttgaacttcaaaagcatccttatata 

337 IRGK QIEYIILDPSASAMIVELQKHPYI 

23322 gctagaaagaatatccctatcattcctgctcgaaatgacgtgacgcttggcatttcatttcacgctgaactcttggctgagaat 

365 AR K NIP IIPARNDVTLGISF HAELLAEN 

23406 agatccacactcgaccctagcaacacgcacgacactgacgaataccacgcttacagccgggacagtaaagcgagccaaacggga 



393 
23490 gaagacagagtcactaaagagcatgaccactgcacggacaggaacagatatgcctgtctcaccgacgccctaatcaacgatgac 

421 ioRVIKEHDHCMDRNRYACLTDALINDD 

23574 ttcggtttcgaaacacaaatattatccggaaaaggcgctagaaactaa 23621 

449 pGFElQI LSGKGARN* 

dpiORFOOS a g Ct tca agtcttaaataaagttctcgaagaaaagagcttatccattttagaaaataatggaattgaccaagaatac 

1 v iQ LQVLNKVLEEKSLSILENNG IDQ E Y 

49708 ttcacggattatttagacgagtatcaatttattcaagaacacctttogagatatggaagagttccggacgacgaaaccattctc 

29 FTDYLDEYQFIQEHFSRYGRVPDDETIu 

49792 gaccattttcctggactcgaatttttcgaaattggcgaaactgacgaatacctcatcgacaagctaaaagaggagcatctatat 

57 DHFPGFEFFEIGETD,EYLI DKLKEEHLY 

49876 aattcacctgttccaattttaacggaagcggctgaggacatccaagtagacagtaacattgcgactgcgaatataattccaaaa 

85 nsLVPILTEAAEDIQVDSNIAIA NIIPK 

49960 ctagaagaactcctcaatcgctctaaattcgtaggcggaccagacattgctcgaaatgctaaactecgactagactgggcgaat 

113 LEELFMRSKFVGGLDIARNAKLRLDW A N 

50044 actatcagaaaccatgacggtgaaagacttggaatatcgacagggtttgaactattggacgacgcgcttggaggcttacttcct 

141 xiRNHDGE RLGISTGFELLDDVLGGLLP 

50128 ggtgaggatttgattgtcataacggctcgacccggacaaggtaagccgtggactattgacaaaacgcttgoaactgcttggaag 

169 G E D L I VI M ARPGQGKSWT I D KMIiATAW K 

50212 aacgggcatgatgcccttctatatagcggggaaacgagtgaaatgcaagttggtgctcgtatagataccattctttcgaatgtt 

197 N G H DVLLYS G EMS EMQVGAR I DT I LS NV 

50296 agcatcaattcaattaccaaagggatttggaacgaccatcagttcgaaaaatacgaggaccacattcaagcaatgactgaggct 

225 S I M S I T KG I W N D HQ F E K Y E D H I Q AM T 

50380 gaaaattcccttgtggtagtcacgccctttatgattggaggaaagaaccttacccctgcaattttagatagcatgatatctaaa 

253 ENSLVVVTPFMIGGKNLTPAILDSMI SK 

50464 catagaccatctgtggtggggattgaccagctttcacccatgagcgagtcctatccaagcagggagcagaagcgaatccagtac 

281 YRPSVVOIDQL SLMSESYPSREQKRIQ* 

50548 gccaacatcaccacggacctatacaagatttctgctaaatatggaattcctatcgtgcttaatgtccaagcagggcgttcggct 

309 a M I T M D L Y K I S A K Y G I P I V L M V Q A G R S A 

50632 aaaactgaaggcgctgaaagtatggaactagaacatatagcagaaagtgatggagtaggtcaaaatgctagcagagttatcgct 

337 KTEGAESMELEHIAESDGVGQNASRV IA 

50716 atgaagcgtgacgaaaaatccggcatacctgaactacctgtcgctaaaaaccgacatggcgaagaccgaaaaatcaccgaatat 

365 M K R DEKSGILELSVVKNRYGEDRKIIEY 

50800 acgtgggacgtcgaaactggaacctatactcttataggatccaaagaggaaggcgaagaaggaaccgaaaaaggcgaaagctcc 

393 M W D VETGTYTLIGF KEEGEEGTEKGESa 

50884 ccattgaaagcaaaagcctctaggtcgaccgcccgtcttcgaagtaaggttacaagggaaggagttgaagcactttga 50961 

421 PL KAKASRSTARLRS KVTREGVEAF* 
till" 00 ' acgacagactctaaaaaacgcttcaagaaagcagtaacagaaacaatcaatcgcgacggtatcgagaaccttatggaccggctc 

1 M TDFKKRFKKAVTET IN RDG I ENLMDnii 

13244 gaaaatgataccaatttctcctcaagcccagcaagcactcgataccatggaagctacgaaggtggacttgtcgagcacccacta 

29 Imdtnpfsspast ryhgsyegg-lv e H S L 

13328 aacgtgctcaatcaactactttccgaaatggataccacggtaggcaaaggctgggaagacatttacccaatggaaacagtcgca 

57 nvfhQLLFEMDTMVGKGWEDIYPMETVH 

13412 accgtagcaccatttcacgacccttgcaaagttggtcagtatcgcgaaactgaaaaatggcgcaagaacagcgacggtgaatgg 



85 

13496 



ivalfhdlckvgqyretekwrknsdge W 
gaaagctattcagcacatgaatacgaccctgagcaacttacaatgggacacggtgcaaaatctaatttccttcttcaacgtccc 
29025 acagctgtatcagctgttatgattccatcattcgaaggaattgactatgtaggagttctcacaactaattag 29096 

337 tvVSAVMIPSFEGIDYVGVLTTN* 

dplORF012 ctcaaaaccgaagaact:ttcaaa aattgtttctcagctcaataagttgaagcctagcaagttgctagaaatc 
X M S I KFKTEELSKIVSQLNKLKPS KLLEI 

5430 acaaactattggcatatttttggtgacggcgaatgcgtcatgtttacagcgtatgatggctcaaacttccttcgatgcattatc 

29 TNYWH I FGDGECVM FTAYDG S M FLRC 

5514 gacagcgatgttgaaattgacgtgattgtgaaagcagagcagtttggaaaacttgtagaaaagaccacggccgcaaccgtcaca 

57 d s DVEIDVIVKAEQFGKLVEKTT AATVT 

5598 ttagttcccgaagaatcttcgctaaaagttattgggaatggtgagtacaatattgatattgttacagaagatgaagagtaccct 

85 lvpeesslkvigngeynidivtedeeyp 

5682 acattcgaccacttgctcgaagaegtgagtgaagaaaatgctctcactttgaaaagctcgctgttctacggaatcgccaatatc 

113 TFDHLLEDVSEENALTLKS S LFYG IANI 

5766 aacgattctgcggtatctaaatcaggagcagatggaatttataccggcttcctgttaaaaggcggaaaagcaattactacagac 

141 NDSAVSKSGADGIYTG FLLKGGKAITTD 

5850 atcatccgcgtatgtatcaaccctatcaaggaaaagggactagaaatgctcattccttacaacctaatgagtattttagcaagt 

169 i i R VCINPIKEKGLEMLI p Y N L M S I LAS 

5934 attcctgatgagaagatgtacttctggcaaattgacgatactactgtctatatttcatcggcttcagtcgaaatttatggaaaa 

197 I pDEKMYFWQIDDTTVYI S SASVE IYGK 

6018 ttgatggaaggtatggaagattatgaagacgtttcacagcttgactcaattgagtttgaagatgatgcggctatccctacagca 

225 LMEGMEDYEDVSQLDSIEFEDDAAIPTA 

6102 gaaatcctgagcgtattagaccgccttgtactattcacttcagcctttgacaaaggaaccgtcgaattcttattcttgaaagac 

253 e I L SVLDRLVLFTSA F D KGT V E F L FLKD 

6186 cgacttcgaactaaaacttctactagcagttatgaagacatcatgtacgcatctgctggcaagaaagtttcgaagaaagaattc 

281 R L R I KTSTS S Y E D I MY A S AG K K V S ^ 

6270 acttgccaccttaacagcttactcttgaaggaaattgtaccaaccgtcaccgaagaaaacttcactgtctcttatggaagcgaa 

309 TC HLNSLLLKEIVSTVTEENFTVSYGSE 

6354 accgcaattaagatttcatcgaatggtgtcgtttacttcctagcacttcaagagccggaagaataa 6419 

337 AIKISSNGVVYFLALQE PEE * 

dplORF013 

10215 atgaatttagcttctaaataccgtcctcaaactttcgaggaagtggtagctcaagaatatgtcaaagaaattcttttgaatcaa 

1 M NLASKYRPQTFEE V VA QEYVKEILL N Q 

10299 ttacaaaatggcgctatcaaacacggctatctattctgtggtggcgctggaactggtaaaaccactactgctcgaattttcgcg 

29 LQNGAIKHGYLFCGGAGTGKTTTARI F A 

10383 aaggatgtgaacaaaggacctggctctcctattgaaattgatgctgcttctaataatggggtagaaaatgttcgaaacattatt 

57 K DVNKGLGSPIEIDAASNNGVENVRNII 

10467 gaagattctagatacaagtctatggacagcgagttcaaagtttacatcattgacgaggttcatatgctttcaaccggagcattc 

85 EDSR YKSMDSEFKVYI IDEVHMLSTG A F 

10551 aatgcgctgttgaaaacattagaagagccctcatcgggaaccgtgttcattctatgtactactgaccctcaaaagattcctgac 

113 NA L L KTLEEPS SGT VFILCTTDPQK 

10635 actattctcagtcgagttcaacggtttgactttactcgaattgataatgacgacatcgttaatcaacttcaatttattatcgaa 

141 TILS RVQRFDFTRIDNDDI VNQLQFIIE 

10719 agtgaaaatgaagaaggagctggttatagttatgagcgtgacgccctttcgtttattgggaaacttgcaaatggaggaatgcgt 

169 S ENEEGAGYSYERDALS F I G KLANGG M R 

10803 gacagtatcacaaggctcgaaaaagtccttgattatagtcatcacgttgacacggaagccgtttctaatgcactaggagttccg 

197 D S I TRLEKVLDYSHHVDM EA V S NA LG VP 

10887 gactacgaaacattcgcttcacttgttgaagctattgccaactatgacggctcaaagtgtttagaaattgtaaatgacttccac 

225 DYETFASLVEAIANYDGS KCLE IVNDFH 

10971 cactcaggaaaagacttgaaattagtgactcgaaaccttacagacttccttttagaggtttgtaagtattggctagttcgagat 

253 Y S G KDLKLVTRNFTD FLL EVCKYWLVR D 

11055 atttcaatcactcaacttcctgctcattttgaaagtaagctagagcaattctgtgaggcttttcaatatcctactctattgtgg 

281 ISITQ LPAHFESKLEQFCE AFQYPTLL W 

11139 atgctagaagaaatgaatgaacttgctggagttgttaaatgggagcctaatgctaaaccgataattgaaaccaaacttcttttg 

309 M L EEMNELAGV VKWEPNAKP I IETKLLL 

11223 atgagcaaggaggagtga 11240 

337 M S K E E * 

dplORF014 



50961 atgaaagtaaatggtcttcaaattgaagcgactcctgaacaaataactgaaaaactttcgagacaacttgaagacgaaggaaca 
1 M KVNGLQIEATPEQI I EKLS RQLED EGT 

51045 ttcacttttagacgaactaagtcgcttggaagcaactatcaattctcatgcccgtttcatgcaggagggactgaaaagca^ 
29 FI FRRTKSLGSNYQFSC P FHAGGTEK HP 

51129 tcttgtggcacgagtagaaatccttcttattcaggaagtaaggtgacggaagctggaacggttcactgtttca^ 
57 SCGMSRNPSYSGSKVTEAGTVHC FTCGi 

51213 acttcaggactaactgaattcgtctcgaatgtattaggtcgaaacgatggaggg^ 

85 tsG LTEFVSNVLGRNDGGFYGNQWLKRN 

51297 tttggaacatctagcgaagtagttaggcaaggcgtcagccctgaagcgtttcgaa^ 

113 pGTSSEVVRQGVSPEAF RRNGRTBitvcn 

51381 aaaatcattcctgaagaggaacttga^ 

141 KI I PEEELDKYRF IHPYMYERKLTD L I 

51465 gagatgtttgacgtaggttatgacaaactgcatgattgcatcacctttccagtacggaacctcaagggcgaaaca 

169 E MFDVGYDKLHDCITFPVRNLKGETVFF 

51549 aaccgtcgaagtgttcgttctaagtttcaccagtacggtgaagatgaccctaaaacggaatttctttatggccaatatgagctc 

197 NR R SVRSKFHQYGEDDPKTEFLYGQYEL 

51633 gtagcatttcgagactattttgaaaaacctattagtcaagtattcgtgactgagtctgttatcaactgcttgactctttggtca 

225 VAFRDYFEKPISQVFVTESVINCLTLW S 

51717 atgaagattccagcagtcgctcttatgggagtaggtggaggaaatcaaatcaatttactaaaacgacttccttatagaaatatc 

253 MKIPAV ALMGVGGGNQINLLKRLPYRNI 

51801 gttctagcacttgaccctgataacgctgggcagacagcgcaggaaaaactccaccgacagttaaagcgaagcaaggtcgctaga 

281 VLALDPDNAGQTAQBKLYRQLKRSKVVR 

51885 tttttgaactaccctaaagagctccatgataataagtgggatataaacgaccatccggaattattaaatttcaatgatttagtc 

309 plnyPKEF.YDNKWDINDHPELLNFNDLV 



309 

51969 ttgtag 51974 
337 L * 



SSI" 0 " atgggatttaatctatactccgcaggaggccacgctactagcactgacgattatttgaaggaaagaggagccaatcgcctatcc 

1 M GFNLYFAGGHAISTDDYLKERGANRLF 

3877 aatcaactgtacgaaagaaacgggattggcaaaaggcggattgagcataagaaaaccaatccaagcactacttcaaaactattc 

iq MOLYERNGIGKRWIEHKKTNPSTTSKUt 

»1 gtcgactctagtgcatattctgctcataccaaaggggctgaagttgacattgacgcctatatcgaatacgtgaatgataacgtg 

„ VDSSAYSAHTKGAEVDIDAYI EYV NUnv 

4045 ggaatgtttgactgtatcgccgaactcgataaaattcctggtgtatttagacagcctaagacacgtgaacagcttttggaagca 

oc c m FDCIAELDKIPGVFRQPKTREQLLEA 

129 ccaLaatttcttgggataattatctatacatgcgcgagcgaatggttgagaaagacaagctcttacctattttccatatggga 

,1, poTSWDNYLYMRERMVEKDKLLPIFHMG 

3 gaagactttaaatggctcaacttgatgctcgaaactacattcgaaggcggaaagcatattccttacactggaatttcaccagcc 

... PDPKWLNLMLETTFEGGKHIPYIGISPA 

J 2 97 aatgactcgactacgaagcataaagacaagtggatggaaagagtattcgaagttattcgaaacagttctaatccagacgttaag 
MnSTTKHKOKWM-ERVFEVlRMSSNFUv* 

1 ■ rrrrrrrrrrrrmr'^^ 



281 

4717 cgactattttag 4728 

3 09 



44253 aaacctcaattcaccgtagagccggacgggctcattactgctaaagtttaa 44303 
281 KPQFTVE PDGLITAKV* 



16 9 
11830 
197 
1L914 
22 5 
11998 cgtttgcgaaaggtatctaaaaagggctcaaatgcgcgtgtctgcgtgaacgaatttatcaggagggtcaaacaagttgagtga 
253 CLRKVSKKG 



12081 SNARVCVNEFIRRVKQV 



85 RS*.srn*.A .^r.oft-r^aaAt-arttacqaqtac 



36183 



RSKSFWRISTLEDPU"! - 
ggaaaacttgtagacgttcaagcctttaaagatacctccctcgtagttaaattagggattcagttcaaagatgcttacgagtac 

«67 agcgactcaactgttcgaaaggtttataagtttcaacccgctttgggaggcgatagcttacctaacccaggaagacctactcga 
1A1 qDSTVRKVYKFQPALGGDSLPNFLiKr x « 

E'YRADDAAAWTSTLPAOVSLFLMPSYY* 

12917 gaaagttcagtggtctatatttctttgtataaaattttttcacttacttaa 12 967 
253 ESSVVY I SL YKIFSLT* 



4 e| 4. rrrrr .C99« rr _ 

£1. 9 a«9.. t 9 rrrr c« r; 9, r «.- 

2620 cttcatacacttgtttatgataataaaagaggagtataa 2658 

253 LHTLVYDNKRGV 
dplORF021 
2504 



atgcaaacgcatacgaagaaggaaaaatcagtgataggcttcttgaaaagttgggatggctttgggacaaagtgcatgaagacc 

u t u t K KE KSVI GFL KSWDGFG 1 K. n *v 

2SS8 cagLLaaoaatgttcgacctttaccgcaacttcatacac^ 

2 ' 72 LctagatlaaatcggtaacgtaLa^ 
2840 gttaaagcactcgctgaacacaccgtagggtatcgagaagaccctaaacttcatctcgaaaaaacattcgacgtcgaccatgaa 

113 VKALAEHTVGYREDPKLHLEKTFDVDHE 

2924 gaccttgttcttgtgaaagacattccattcaattctttatgtgagcatcatttagctccgttcgtagggaaggtgcatattgca 

141 DLVLV K DIPFNSLCEHHLAPFVGKVHIA 

3008 cacattcctaaggataagattacaggtctttcaaaattcggtcgagtggttgaaggacacgctaaacgacttcaagtacaagag 

169 yipkDKITGLSKFGRVVEGYAKRLQVQE 

3092 cgcttgactcaacaaatcgctgacgctattcaggaagttctaaatcctcaagcagttgcggtcatcgtagaggctgagcatact 

197 RLTQ Q I A D A I QEVLN P QAVAV I VEAEHT 

3176 tgcatgagcggacgcggtattaagaagcacggggcaacgacagtgacttcaactatgcgaggtctttcccaagatgacgcatcc 

225 c m S GRGIKKHGATTVTSTMRGLFQDDAS 

3260 gctcgagcagaattgcttcagttgattaaaaagtag 3295 

253 ARAELLQLIKK* 

dplORF022 

30896 atgagtaaagacattctttacggaatcaagctcgtgcaaatcgaggagcttgacccattgactcagttgccaaaagtcggcgga 

i mskdilygiklvqieeldpltqlpkvgg 

30980 gctaactttgtcgtagatacggcagaaacagcagaactcgaagccgtgacctcggagggaactgaagatgtgaaacgcaatgac 

29 anfvvdtaetaeleavtsegtedvkrnd. 

31064 acgcgcattcttgctatcgtgcgtactccagaccttttatacggttatgacttaacattcaaggacaacacgtttgaccctgaa 

57 t r ilaivrtpdllygydltfkdntfdpe 

31148 atcatggccctaattgaaggtggtacagtacgtcaacaaggcggaactattgctggatacgacaccccaatgcttgcacaaggt 

85 IM ali eggtvrqqggtiagydtpmlaqg 

31232 gcttctaatatgaaaccatttagaatgaacatctatgtgccaaactatgtaggtgactcaattgtcaactacgtgaaaatcact 

113 ASNMKP FRMNIYVPNYVGDS IVNYVKIT 

31316 ttgaataactgtaccggtaaagctccagggctttcaatcgggaaagagttctacgctcctgagttcaacatcaaggcacgtgaa 

141 L N NCT GKA PGLS IGKE F Y A P E F N I K A R E 

31400 gcaaccaaagcaggtttgccagttaagccaatggactatgcggcacaacttccagcggttcttcgtcgcgtgacattcgatttg 

169 AT KAG L P V KS MDYVAQ L P AVLRRVT F D L 

31484 aacggtggaacaggaaccgccgacgcagttcgagttgaagcaggtaagaagatttctccaaaaccagttgaccctaccttaaca 

197 NG GTGTADAVRVEAGKK I S PKPVDPTLT 

31568 ggtaaggctttcaaaggctggaaagttgaaggagaatcaactatttgggacttcgacaaccacatgatgcctgaccgagacgtc 

225 G K A FKGW KVEG ES T I WD F DNHMMPDRDV 

31652 aaactcgtagcacaatttgcatag 31675 

253 KLVAQFA * 

dplORF023 ccaatttaact agaattgcaa ag a tggttagagcaggaaacagtgaaggtcctgcttcatcttttgtcaattcg 

1 M A K SNLTRIAKM VRAGNS EGPASSFV 

6503 c tgacccgggttattgaacgaactcagcctgaatataatccttcgacatattataagcccagcggggttggtggatgtattcga 

29 LTRVIERTQPEYNPSTYYKPSGV GG C IR 

6587 aaaatgtatttcgaaagaatcggtgagtctattatagataacgcagattctaacctaattgcaatgggcgaagctggaacattt 

57 KMYFERIGES I IDNADSNL IAMGEAGT F 

6671 aggcacgaagttctccaagagtacatggttaaaatggctgaaatcgatgaggactttgaatggttgaatgtagcagagttcttg 

85 RH EVLQ E YMV KM A E I D E D F EW LNVAE F L 

6755 aaagaaaatccagttgaaggaactatcgtcgacgagcgtttcaagaaaaacgattatgaaacgaagtgtaagaacgaacttctt 

113 KENPVEGT IVDERFKKNDYETKCKNELL 

6839 caactttcattcttgtgtgacggactagttcgatataaaggcaagctctacattttagagattaagactgaaaccatgttcaag 

141 qlsFLCDGLVRYKGKLYILEIKTETMFK 

6923 ttcactaaacatactgagccctatgaagaacacaagatgcaagcaacttgctacggaatgtgtctaggagtcgatgatgtcatt 

169 ftkHTEPYEEHKMQATC YGMC LGVDD V I 

7007 ttcctttatgaaaatcgagataacttcgaaaagaaagcctacacgtttcacatcacagacgagatgaaaaa 

197 " " 

7091 
225 

7175 aaggaaggtcgaaatctgtga 7195 

253 K E G R N L * 

dplORF024 ^ c a g taga tggccaggtagttcatattctacaagtattagcagaagatggaaatgctacggctgaaaagttcgaaaag 

MNAVDGQVVH I LQVLAEDGNATAEKF EK 



J' L Y E N R D U F E K KA YT F H I TD EMKNQVLG 
aaaattatgacctgcgaagagtatgtagagaaaggcgaaagtcctaaaatctattgctcttcagcctattgcccatattgtaga 

KIMTCE EYVEKGES PKI YC S SAYCPYCR 



26076 gaagtcagggctgcatctttagtattttcacgaagagcagccga^^ 

29 EVRAAS LVFS RRAAEAVVKG E IYKDGKN 

26160 ctctcgaaacgtgtttggtcttcagccgcacgcgcaggaaatgatgttcaacaaatagtcacacaaggcctagcaagtggaatg 

57 LS KRVWS S AARAGNDVQQI VTQGLAS G 

26244 tctgctacagatatggctaaaatgctcgagaaatatatcgaccctaaggttcgaaaagattgggactttgataagatagctgag 

85 SATDMAKMLEKYIDPKVRKDWDFDKIA E 

26328 aagctagggaaacctgctgctcataaatatcaaaatctcgaatacaatgcccttcgacttgctcgaactaccattagccattcc 

113 K L G K PAAHKYQNLEYNALRLART T I S H 

26412 gccacagctggagtgagacaatggggcaaggttaatccttatgctcgaaaagttcaatggcattctgttcacgctccaggt^ 

141 ATAGVRQWGKVNPYARKVQWHSVHAPGR 

26496 acgtgtcaagcgtgtatcgatttagatggtgaagtatttcctatcgaagaatgtcctttcgaccatcctaatggaatgtgctac 

169 T C Q A C IDLDGEVFPIEECPFDHPNGMCY 

26580 caaactgtatggtacgaaaactcactcgaagaaatcgctgatgagttgagaggctgggtagacggagaacctaatgatgtatta 

197 qxvWYENSLEEIADELRG WVDGEPN D V u 

26664 gacgaatggtacgacgatttaagttcaggaaaagttgagaaatacagcgacctcgactttgttaaaagttattag 26738 

225 deWYDDLSSGKVEKYSDLDFVKSY* 

illlY° 2S atggcaaagaacaaaaagcgaaaaaaagtaaatgtcaaaaggaaaatgcttatccctacaa^ 

1 " " " 



m^kmkkrkkvnvkrkmli ptnlskkvnv 
18694 aaagcaatcgcttatagaaaagtcactgttaagtggctgcctaatacagatgaaattcaagtatatttcgacctttatataaat 

29 KA IAYRKVTVKWLPN TDEIQVYFDLYIN 

18610 aaaaacaggctgacaatgttaggcactattgacccggacaagagctattttgaaggaattaggattgtttgtaagaaacctcag 

57 KNRLTMLGT IDP-DKSYFEG IRIVCKKPQ 

18526 ccttggatgactgttaaggagctccaggttgcgcgtgcagacgccccaggtttttttgcagtccttaaagcctattgtcacacg 

85 P WMTVKE LQ VARADA PG F FAVLKAYCHT 

18442 gttggcgatgtactagatagcggagcagagcctactgaaattgttcaaggtattatgtataaagacggtgaactatttaaggac 

113 VGDVLDSGAEPTEIVQGIMYKDGELFKD 

18358 agtgaaattgtcagccttttcaaatacgatgtcaaagagccttatgagtttccaaaggaccttcctataaccttggacaacttt 

141 S E IVSLFKYD VKEPYEFPKDLP ITLDNF 

18274 ttagagttcattatgtctagccagcatactagagcacttgttttgcgttgtgctaatataggtgagttttccaagaattggcgg 

169 L E F I M S S Q.H TRALVL'RCAN I G E FS KNWR 

18190 aaatggcaaaaagctatccagctcctgctcgactatgccaaggcggatgactttaaagtagacgaaactgtttgggacttttca 

197 KW Q KA I Q L L LDYAKADD FKVDETVWDFS 

18106 cccggctctaaagctggaaaggtagcacgtcgtaaaggctatgaggcaattcaacaagcccttgagcagataaataaataa 

18026 

225 PGSKAGKVARRKGYEAIQQALEQINK* 
dplORF026 

21512 atggcgaaagctactggaccaaaagttcgaagaggaaaaactcctccacggccaaaagacaaaaaaggaatcaaagcaaatgcg 

1 MAKATG PKVRRGKTPPRPKDKKGIKAN A 

21596 cgtgtcaataaagaccagttcgtagagtatgactataaaggcatcaagatgacaattaaggaacgtgatgctagaatgaaattg 

29 RVNKDQFVEYDYKGI KMTI KERDA RMKL 

21680 gaatttattagaggcatgactattcaggaaattgcagcccgctatggattaaatgaaaagcgtgttggcgaaatacgggctcgc 

57 E FI RGM T IQE IAARYGLNEKRVGEIRAR 

21764 gataaatgggtgaaggctaagaaagagttcgagaatgaaaaggctcttgttactaatgatacactgactcaaatgtatgcaggg 

85 D KWVKA KKE FENEKALVTNDTLTQMYAG 

21848 tttaaagtctcagtcaatattaaatatcacgccgcctgggagaaactaatgaacatcgtcgaaatgtgtttagataatcctgac 

113 FKVSVNI KYHAAWEKLMNIVEMCLDNPD 

21932 agatatttatttactaaagaaggaaatattagatggggcgcattagatgtcctttcgaaccttatagatagagctcaaaaagga 

141 RYLFTKEGN I RWG ALDVLS NLIDRAQKG 

22016 caagaaagagcgaatggaatgcttccggaagaggttcgatatagactacaaattgagcgcgagaaaattacattgctccgggcc 

169 QERANGM LPEEVRYRLQIEREKITLLRA 

22100 aaaatgggcgaccaggaaattgaaggcgaggttaaagataacttcgtagaagcactagataaagcagctcaagccgtttggcaa 

197 KMGDQE I EGEVKDNFVEA LDKAAQAVWQ 

22184 gaatttagtgacgcaacaggttcctacattaaaggagtgactgataatgacaataagcctgagaaataa 22252 

225 E F S DA T G S Y I KGVTDND NK.P E K * 
dplORF027 

52762 atgggaaaagtatcaattcaaaaatcaggaacatttagctcagggtctaataacgagtttttcacactcgctgaccacggtgac 

1 MGKVSIQKSGTFSSGSNNEFFTLADHGD 

52846 agcgcaattgtcactctattgtatgatgacccggaaggcgaagacatggattatttcgtagtccacgaagcagacgttgacggt 

29 S A I VTLLYDDPEGEDMDYFVVHEADV DG 

52930 cgtcgacgctatatcaattgcaatgctattggcgaagacggggaaacagtccatcctgataattgtccattatgccaaaacgga 

57 R R R Y I NC N A I G E DGETVH P DN C PLCQNG 

53014 ttccctcgtattgaaaaactatttcttcaactttacaaccatgatacgggaaaagttgaaacatgggaccgaggccgttcttat 

85 F PR I EKLFLQLYMHD TGKVETWDRG RSY 

53098 gttcaaaagattgttacatttatcaataaatatggaagccttgtgactcagcctttcgaaattattcgttcaggagctaaaggt 

113 VQKIVTFIN KYGSLVTQPFE I IRSGAKG 

53182 gaccaacgaactacttatgaattccttccagagcgtccggaagacagtgctactcttgaagattttccagaaaagagcgaactt 

141 DQRT TYEFLPERPEDSATLEDFPEKSE L 

53266 cttggaactctaattttagacctcgacgaagaccaaatgtttgacgtggttgacggcaagttcactcttcaagaagagcgttct 

169 it G T L I It D L D E D Q M F D V V D G K F T L Q E E R S 

53350 tcaagtcgttcaaattcacgtagaggagcatctcctgcgcctagacgaggttccggtcgagaatcttcacaaggtcgaacagct 

197 SSRSNSRRGASPAPRRGSGRESSQGRTA 

53434 gaaagaactccttcagttagtcgaagaactcctccaacacgaggtcgaggattctaa 53490 

225 ERT PSVSRRTP P TRGRGF* 
dplORF028 

44595 atgtcaaaaattaaattcgaaaaccttaaaaaaggcgatgttgtgctacgagccaaatctcaaacgaagtttaaaatcgtttca 

1 M S K I KFENLKKGDVVLRAKSQTKFKIV S 

44679 attttagcagacgaaaagaaagcagaccttgaatcattagaagacggaggtgaacttcacctttcagcttcaactctcgaacgt 

29 ILADEKKADLESLEDGGELHL SASTLER 

44763 tggtacacaatggaagatgaaactgaacctaaaaaagaagaagctgctaaacctgctaaaaaggctgctcctgcagttgctcga 

57 WYTMED ETE PKKEEAAKPAKKAAPAVAR 

44847 cctgctcgaaaaggtagagtcgttcccaaacctaaaaaagaagtccttgaggaagaaattcctgaagttaaggaacagccggaa 

85 PARKGRVV PKP KKEVLEEEI PEVKEQPE 

44931 gaagttggttcagttagtgagaaatctactgttcgaaaacctgctcctaaaaaagaaagcgtgatggcgattactaaggctctt 

113 EVG S VS EKS TVRKPAPKKE SVMAITKAL 

45015 gaaagtcgaattgttgaagcccttcctgcgtctactcgaatcgtcactcagtcttacatcgcctatcgctctaagaagaacttc 

141 ESR IVEAFP A STRIVTQSYIAYRSKKNF 

45099 gttactatcgaagaaactcgaaaaggtgtttccattggagttcgcgcaaaagggttgacagaagaccaaaagaaacttcttgca

169 VTIEETRKGVS IGVRAKGLTEDQKKLLA 

45183 cctattgctcctgcaccttacgaatgggcgatcgacggaatttttaaactcgtcaaggaagaagatattgacaccgcaatggaa 

197 siAPASYEWAIDGIFKLVKEEDIDTAME 

45267 ttgattgaagcttctcacctttcttcgctatga 45299 

225 LIEASHLSSL* 
dplORF029 






662 atgaaatcagtagttttattatccggcggagtcgactcagccacttgtttagcaattgaagttgacaagtggggttctaaaaat 

1 MKSVVLLSGGVDSATCLAI E V D K W G S K N 

746 gttcatgctatagcattcaattacggacaaaagcatgaagcagaacttgaaaatgctgctaatgttgcaatgttctacggagtc 

29 vHAIAFWYG QKHEAELENAANVAMFYGV 

830 aagttcaccattcttgaaattgactcgaaaatctactcaagctctagctcttccttattacaaggaaaaggcgaaatttcacat 

57 K FTI LEIOS KIYSSSSS SLLQGKGE I SH 

914 ggaaaatcttacgctgaaatcctagca^agaaggaagtagttgacacctatgttccatttagaaatggactaatgctttcacag 

85 gkSYAEILAEKEVVDTYVPFRN GLMLSQ 

998 g c tgcggcttatgcttattcggttggagcttcttacgtcgtatatggtgctcacgcagacgacgcggctggaggtgcttaccct 

113 aaaY AYSVGASYVVYGAHADDAAGGAYP 

1082 gattgcactcctgagttctataattcaatgtcaaatgcaatggaatatggaactggaggcaaggtaacccttgtcgctcctcta 

141 dctpefynsmsname-ygtggkvtlvapl 

1166 cttactctaaccaaggcgcaagtcgttaaatggggaattgatttagatgttccttatttcttgactcgttcatgttatgaaagt 

169 ltltkaqvvkwgidldvpyfltrscyes 

1250 gacgctgaaagttgtggaacttgcgcaacttgtatcgaccgcaaaaaggcattcgaagaaaatggaatgactgaccctattcat 

197 daescgtcatcidrkkafeen gmtdpih 

1334 tataaggagaattga 1348 

225 Y K E N * 
dplORF030 

20088 atgaataacgaaaaaattattgaaaaaatcaaaaatcttattcaattagcaaatgacaacccgagtgacgaagaggggcaaact 

i mnnekiiekik. nliqlandnpsdeegqt 

20004 gcccttcttatggctcaaaagttgatgctaaagaataatatcgcacttgctcaagttgaacaatttgatgaacctaaacagttc 

29 ALLMAQKLMLKNNIALAQVEQFDEPKQF 

19920 gagacttctcaagctgttgggaaagaagcaggtcgaatattttggtgggaacgtgaacttggtcatattctcgcgactaatttt 

57 etsQAVGKEAGRIFWWERELGHILATNF 

19836 aggtgcttttgtattaatcagcgtgatatgcgcttgaataaaagtcgaataattttcttcggcgaaaaacaagacgctgaatta 

85 rcfcinqrdmrlnksriiffge kqdael 

19752 gtgtccaaaatatatgaggctgctttgctttatcttcgttaccgtattgaccgacttcctactcgcgaaccttcctacaagaat 

113 vskiyeaa llylryridr lptrepsykn 

19668 tcatacctcaaaggctttttgccagccttagccattcgatttaaaaagcaggtggaagaatattcacttatggtcctacctagc 

141 sylkgflsalairfkkqv eeyslmvlps 

19584 gagcaaacaaaaaatgcgcttcaggacacatttcgaaatttaaagaaggaaggaattgacagacctcaacatgacttcaatctt 

169 EQTKNALQDTFRNLKKBG I DRPQHDF M L 

19500 gaagcgtatattgaagggcggtttcatggcgagaatgcaaagattatgcccgatgaaattttggaaggcggtaactaa 19423 

197 EAY I EGRFH G EMAKIMPD E I LEG G N 

dplORF031 ttatcaatt agaagacttgttaaaaggtctagatgaaccaactatcaaacaggtgaaggaaattacttcgaaaacttcg

1 M A YQLEDLLKGLDEPTI K QVKEI ISKTS 

27027 aaagaacccgatgctaaaattttcattgacggcgacggtcaacattttgtacctcacgcacgtttcgatgaagttgttcaacag 

29 KELDAKIFIDGDGQHFVPHARFDEVVQQ 
27111 cgcgatgcagctaacggctcaattaattcttataaagaacaagtcgcgacgctttctaaacaggtcaaagataacggtgatgcg

57 R D A A N G S INSYKEQVATLS KQVKDNGDA 

27195 cagaccactatccaaaaccttcaagagcaactcgacaagcagtctcaacttgcaaaaggcgctgtgattacttcagctcttcat 

85 qttiqnLQEQLDKQSQLAKGAVITSALH 

27279 ccgttgattagtgactccattgctccagcagcagacattcctggatttatgaaccttgacaacattacggtcgaaagtgacggt

113 p L ISDSIAPAADILGFMNLDN I TVESDG 

27363 aaagttaaaggtcttgatgaagagttgaaagctgttcgtgagtctcgtaaatacttattcaaagaagtcgaagttcccgcagaa 

141 kvKGLDEELKAVRESRKYLFKEVEVPAE 

27447 caagaggctcaagctaagtcgccagccgggactggaaatttaggaaatccaggtcgtgtcggtggtggtgttcccgaacctcgt

169 qeaqakspagtgnlgnpgr vggg vpepr 

27531 gaaatcggctcttttggtaagcaacttgctgctgctcaacaaacggcaggagcacaagaacaatcatcattctttaaataa 

197 11 eigSFGKQLAAAQQTAGAQEQSS FFK* 

dplORF032 ^ aagaagcgaatagactagtttc tagctatgtaggattcgaatgccggactgacgaagaatgtatcaggaactttgaacta 

1 MKEANRLVSSYVGFECWTDEECIRN FEL 

52117 gaccctgatatgtcaattgcgtctgcttatcatcgttattttgggatgctttattcctatgcaaaaaggtttaaatgcttatct 
29 D POMS IASAYHRYFGMLYSYAKRFKCLS 

52201 cgacatgacattgaaagcattgcattcgagactatttcaaaatgtttggcaacgttcaaatcaaaccaaggggccaagttttca 
57 RHD I ES IAFETISKCLATFKSNQGAKFS 

52285 acctaccttacaagactcttcaagaatagaatagtcttagaatataggtacctaaatgcaccttccatgaatcgaaactggtat 
85 tylTRLFK NRIV. LEYRYLNAPSMNRNW Y 

52369 gtagaagtgacgttcgatagcgtttcgacaaatgaagaaggcgacgattttagtatcctatcgacagttggctattgtgaagac 
113 VEVTFDSVSTNEEGDDFSILSTVGYCED 
52453 tacggaaaaattgaaattgaagcaagtcttgacttcatgacgctttctaatacagagtatgcttatatctcgtctgtcattcaa
141 YGKIEIEASLDFMTLSNTEYAYISSVIQ
52537 aacggtccttcagtaagcgacgcagaaattgcgcgtgaaattggagtaagcaggtctgctactagtcagtctaagaagtcacca 
169 MGPSVSDAEI AREIGVSRSAISQSKKSL 

52621 aaaaataaattaaaagattttatataa 52647 

197 KNKLKDFI* 

dplORF033 cctaagttaccccaaattgatattcgagaagaagaaat 

1 MARPKLPQIDIREEEIRDAQDVADSY G A 

7754 attatcaataaagtagtcgacgaaattgttgaagcagcttgcggctcacttgaccaggcaatggaagaaattcaaatagtcgca 
29 I INK VVDE I VEAACGSLDQAMEE IQ IVV

7838 agccaaaatcctgtcattatggaagaccttaactactacattggctatcttcccactcttctttatttcgccgcagatagggcg 

57 sqnpvIMEDLNYYIGYLP TLLYFAADRA 

7922 gaaatggtgggaatacaaatggatccaagtcctgctatcaggaaagaaaaatacgataatctatacattttagccgccgggaaa

85 emVGI QMDSSSAIRKEKYDNLYILAAGK 

8006 actattcctgacaagcaagcagaaactcgaaaacttgtcatgaatgaagaagtcatcgaaaatgcttacaagcgagcctacaag 

113 tipdkQAETRKLVMNEEVIENA YKRAYK 

8090 aaagttcaattaaagctagaacaggccgataaggtattagcatctttaaaacgaattcaaacctggcaactagcagagttagaa 

141 KVQLKLEQADKVLASLKR I QTWQL AE LE 

8174 actcagtcaaataattcaaaaggagtattattaaatgcaaaaagacgtagacgtgaaaatgattga 8239 

169 T QSNNSKGV LLNAKRRRREND* 




dplORF034 



131 atgagtcaaaacactacacgcactgacgctgaattgacaggcgttactcttttaggaaaccaagacaccaaatacgattatgac 

i mVqnttrtdaeltgvtllgnqdt.kydyd 

215 tataatccagacgtccttgaaactttccctaacaaacatcctgaaaataattacctagtaacatttgacggatatgaattcact 

29 ynpdVLETFPNKH PENNYL.VTF DGYEFT 

299 tcccttcgccctaaaacaggacagcctgacttcgcgaatgttttcattagttacattccaaacgaaaagatggttgaatctaaa 

57 SLCPKTGQPDFANVFISYI PNEKMVESK 

383 ccattgaaattgtacttattcagtttccgtaaccacggtgacttccacgaagattgcatgaacattattttgaatgacttgtac 

85 sl kl ylfsfrnhgdfhedcmniilndly.

467 gaattgatggaacctaagtacattgaagtcatgggcctattcactcctcgtggtggaatttcaatttacccattcgtcaacaaa

113 E L M E P KYI EVMGL FT P RGG I S I Y * \ 

551 atgaatcctcaatttgcaactcctgaacttgaacagcttcaacttcaacgcaaattgaacttccttggaaatgttcaaggtctt 

141 v nPQFATPELEQLQLQR. KLNF LG NVQ GL

635 ggacgagctattcgatag 652

169 G R A I R *

dplORF035

17425 atgcacctaatgaaggattcgaagatgttgaggacatggaagtccttagcattcgagttcgaaacgaaggtgaggacgacgag

1 MHLMKDSKMLRTWKSLAFEFETKVRT TS 

17341 qggttgaagttatcgcctgctatgaaaacgatgacgaggacgaagatttggaagggttataaaatgaaggtatttatcaacaat 

i734i ggg^ ^ ^ 5 p amktmtrtki wkgykmkvf i nn 

17257 catactgaagctgatattgactacaaagatattctaaattttgtagcttatcgaaactctcctaaccctcaaattcaaatcacc 

57 HTEADIDYKOI LN FVAYRNSPNPQIQIT 

17173 agctggaacgctttgctttcctgctatacacggaatgagctttcttataaaggagtttcaataacggacttttttgaagccatt 

85 S W N iL LSCYTRNEL SYKGVSlTDFFEAI 

17089 caaactattgcaagttccttcactcacctagactcgaaaacaattgacacacaaaatgaaaagcgactcgaaaggattgaggaa 

113 QTlASSFTHLDSKTIDTQNEKRLERIEb 

17005 cttcagtcaagaataggtcattgtaactgtactatcgacgaacttaaaaaaggagtccacgaaatgccggatattgaatcagcc 

141 lqsrighcnctidelkkgvhempdiesa 

16921 aC ttcttaccagtacggacagattcttgcttatgaagatgaacttaattttctgctaaactaa 16859 

169 I S Y Q YG Q I LAYED E LN F L LN * 

dplORF036

48808 gtgttagtcgaacgaaaagccgacaaggaatgttgggaatggctagaagctgttcgagcaaatatagtcgaagaagttcgaa^

1 vLVERKADKECWEWLEAVRANIVE EV R N

48892 ggtcttagcattgttattgcttcgaatactgtcgggaatgggaaaactagctgggcggttcgacttttgcaacgctatttagca 

29 GLSIVIASNTV.GNGKTSWAVRLLQR YL A

48976 gaaactgcacttgacggaagaattgttgagaaaggaatgtttgtagtgtcagctcaactattgactgagttcggcgactataat 

57 iTALDGRi y. EKGMFVVSAQLLTE F GDY N 

49060 tattttcaaaccatgcaagaatttctcgaacgtttcgagcgccttaagacttgtgagctattagtcatagacgaaataggtgga 

85 YFGTMQEFLERFERLKTCELLV IDEIGG

49144 ggttccttaaccaaggcctcttatccttatctgtatgacttggtcaattatagggttgacaataacttgtcgactatttatacg 

113 G SLTKASYPYLYDLVNYRVDNNL.STIYT .

49228 actaattatactgacgatgaaattattgaccttttaggccaaagg^ 

141 xnyTDDEIIDLLGQRLYSRIYDTSV VUU

49312 tttcaggcaagcaatgtaagaggattggaggtaagcgaaattgaatcatag 49362

169 FQASNVRGLEVSEI ES * 

dplORF037

55855 atggtgaagaaattgaaatctaaaatctattcagttgcatatataattctagtagttattgcgaaccttgtgacaatttat^

1 MVKKLKSKIY SVAYI I LVVIANL VT I Y F 

55939 gaacctttaaatgtgaaaggaattttaattcctccaagcagttggtttatgggattcactttcctgcttataaatctaataagc 

29 E PLNVKGI LI PPSS WFMG FTFLL.IN LI S 

56023 aagtacgagaagccaaaatttgcaggttctttgatacgggtagggttattccttacctcgttgatttgctttatgcaaaaccta 

57 K YE K P K FAG S L I W V G L F LT S L I C FMQNL 

56107 ccacaatcgcttgtcgtggcttcaggagttgcattttggataagtcaaaaagca^^^ 

85 PQS LVVASGVAFW I SQKASVF I F DKL S N

56191 aaattagactcgaagattgcaaacgctttgtctagcaacatcggttctattatagacgcaaccatatggatttcattaggactg 

113 KLDSKIAMAL SSNIGSIIDATIWISL G L 

56275 agtccccttggaattggaacggttgcatatatagatattccgtcagccgtactaggccaagttctagttcagtttatcttgcag

141 S PLG I GTVAYIDI P SAVLGQVLVQF I LQ

56359 tcaattgcttcgagatatttgaaaaagtag 56388 

169 S I A S R Y L K K 



dplORF038 ctaa ^ ccttaacatfccgacgcagctcatcw 

1 M R VSKTLTFDAAHQLVG.HFGKCA.NLHGH 

1434 acttacaaagtcgaaatttcattagcaggcggaacttatgaccacggttcgagtcaagggatggttg^ 

29 tykve I slaggtydhgssqgmvvdfyhv

1518 aagaaaaccgcaggtacatccattgacagacttgaccacgctgtccttcttcaaggaatgaaccaatcgctctagcaaatgca

57 k kIAGTFIDRLDHA VLLQGNEPIALANA 

1602 gccgacaccaagcgagtcctatttggatttagaactacggctgagaatatgtcaagattccttacctggactctcacggagctt 

85 VDTK R VLFG FRTTAENMS RFLTWTLTEL

1686 atgtggaagcacgctcgtaccgactctatcaaactacgggaaactcctacaggttgcgcagaatgtacccactacgagattccc 

113 M W K HARIDSIKLWETPTGCAECTYYEIF 

1770 acagaagacgagactgaaacgttcaagaacgtaacccctatcgacaaagacgaaaagattactgcccgcgaaattctagagcag 

141 TEDE. IEMFKNVTFID KDEKITVREIL EQ 

1854 gagcaggacaatggtcaa 1871

169 E Q D N G • 

dplORF039

3306 acgaataaaagcgcaacccttcggcttgttcgaacagctcctattgcggctctatatgtgacattgaccgccgcatttcccgcc

1 M NKS ATFWLVRTALIAALYVTLTVAFSA 

3390 actagctatggacctattcaatttagagtcagtgaagccttgattcttctacctttatggaaccatagatggactccggggact 

29 I S Y G P I QFRVSEAL I LLP LWNHRWT PG I 

3474 gtactaggaacaattattgcaaacttcttttcacctcttggactgattgacgttttattcggttcacttgctaccctccttgga 

57 VLGTIIANFFSPLGLI DVLFGSLAT F 

3558 gcagtggcaacggtgaaagttgctaagatggcaagtcotctatattcacttatctgtccagttcttgctaatgcttaccttatt 

85 VV AMVKVAKM ASPLYSLICPVLANAYLI 

3642 gcgctggaacctcgaatagcttactcttcacctttttgggaatctgtcatctatgtaggaattagcgaagcgattaccgtttta 

113 a l eLR IVYSLPFWESV-IY VGISEAIIVL 

3726 atttcatacttccttacttccacgctggcgaagaacaatcattttagaacactgataggagcgaaaaatgggatttaa 3803 

141 ISYFLISTLAKNNHFRTLI GAKNGI 

dplORF040

7192 gtgagccatactggaaaaatgttcgaggaagactttttcgaaggtgcaaaagactttgagaaagatgctttcacggtccgtcta

1 v s YTGKM FBEDFF EGAKDFEKDAFT VRL 

7276 tatgataccactaatggatctcgaggagttgcaaatccctgcgattatatagccgcaactaactttgggaccttgtttactgaa 

29 vDTTNGFRGVANPCDYIA ATNF GTLF lt

7360 ctgaaaactaccaaagaagcttccttgagctttaataacatcactgataatcaacggttccagctatcacgcgcagatggatgc 

57 LKTTKEASLSFNNITDNQWFQLSRADQt

7444 aaactcatcctcgccggaattttagtgcacttccaaaagcacgaaaagattatacggtatccaatttcaagccttgaaaaaatt 

85 KFI LAGILV-YPQKHEKI I W Y P ISSLE Kl 

7528 aaacggtctggagttaaaagcgtcaacccaaacttcatcgatgcagggtatgaagtttcttacaagaagcgtcgaactagattg 

113 krsGVKSVNPNFID.AG YEVSYKKR R T R L 

7612 accattcctttccaaaatgttctagatgcagttgagcttcattacaaggagaaaagcaatggcaagacctaa 7683 

141 TIPFQ NVLDA VELHYKEK SMGKT* 

dplORF041

8208 atgcaaaaagacgtagacgtgaaaatgattgaccctaaacttgaccgattaaaatacacaggtgatcgggttgatgtacgaatt

1 M Q KDVDVKMIDPKLDRLKYTGDW V D V R I 

8292 agtcccatcactaaaattgacgccgacagcgccgatgtctcaagatgtcgaaaagtgcttcaaaaggctcaagtatattcagtg 

29 S S I T K I DAD SADVS RCRKVLQ KAQVYS V

8376 r ggca rn gca tt aaaa tt gcaca=^ 

8460 agtcttctcaagaaaactggtctaatcttcgtttctagcggagtgatcgacgaaggtcacaaaggtgacactgatgaatggtcc

85 SLFKKTG LIFVSSGVIDEGYKGDTDEW r
8544 tcagttcggtacgctactcgtgacgcagatatcttctacgaccaaagaattgcccaatttagaattcaggaaaagcaacccgct 
113 s vWYATRDADIFYDQRIAQFRIQE K Q P A 

8628 atcaagttcaatctcgtagaaccttcaggaaaegcggctcgtggaggccatggaagtacaggtgacttctaa 8699 

141 IK FNFVESLGNAA RGGHGSTGDF 

dplORF042

48082 gtggcaaggcaaagaataggcaacccaggaaagcctaaaaatgaaattgaactaacattcaaagacaagcctaaaactcgttct
1 vARQRIGMSGKPKNEIELTFKDKPKTRS 
48166 accttattcaagaaggacgtggcaacaggtcttccaaaagtcgagcatgattattttcaaatagttgaagcacttaacggaaaa 

29 TLFKKDVA TGLSKVEHDYFQIVEALNGK
48250 caattcgaacctaatatgaagcaggtgtcatctttctttacagttcagtatgaatttattttcaatattaagcgcatcgattat 
57 QFEPNMKQVSSFFIVQYEFIFNIKCI Ui 

48334 aactggttcaacttttcgagcaccatgaaaaatgtccgaacttatttaaacatcgagtcgaacatcgaactttgtcgattttta 
85 M W F NFSSTMKNVRTYLNIESNIELCRFL 

48418 gccgaaagttttgttaaatatgaaaatgttcgaaaaagatcgaacctaagcgaaaggttcacaacggtctcgacttccaaaaga
113 aeS FVKYENVRKRLNLSERFITVSTFKR 

48502 gcccggattteggacgaactcgaaggaaaaacgggttcaaaattcgaaggatcttattag 48561 
141 awilDELEGKTGSKFEGFY* 




dplORF043 



31699 atgaccaatattatcacagccgagcagcttaagcaacctgcatttcaaaccatcgcacctccaggattttcaaaaggtagtgaa 

1 M T M I ITAEQFKQLAFQI IALPGFSKGS8 

31783 cccatccatgttaaaatccgagcagcaggtgtcatgaacctaatcgctaacgggaaaacccctaatacgcttttaggtaaagtg 

29 P IHVKIRAAGVMNLIANGKIPNTLLG K V 

31867 acagaactgtccggagaaacttcgacagtcaccaaagacaatgctagtccagcatcaactactgaccaacagaagaaagaagcg 

57 TEL FGETSTVTKDNASLASITDQQKKE A 

31951 ctcgaccgattgaacaaaaccgataccggtattcaagacatggctgaactt^

85 LDRLN KTDTG I QDMA E L L RV F A E A b n 

32035 cctacttacgctgaagtcggcgagtatatgacagatgagcaacttatgacaa^ 

113 PTYAEV GE Y M TDEQLMTI FSAMY.GEVTV 

32119 gctgaaacctttcgtacagacgaaggaaatgtctaa 32154 

141 AETFRTDEGNV*. 
dplORF044

25666 atggcaagtgtttcgactagcagcagctcctttttgaagttcctgctccattttagctcgacaagtacttctaaaccgaataag

1 MVSVLISSSSFLKFLLHFSSTSISKSNK 

25582 gttttcaatttccctgtttcctacataagtggtgaaccgataatggcactcaggacattcgaagaacctccactctacgccctt

29 VFNFLVSYISGEPIMALRTFEESPLYAL

25498 ctcgatatgtctcgaaataatctgtttagacgtaaggtcgaacttatgctcacaatggtcacaattaacctcgaacgtctgggt 

57 FDM FR N SLFRCKVELMLTMVTINLE R^ L G 

25414 cgactccttcctcggttggttgttcagcttgttctttttctttgtcatcaacttcgtcctctccactcgtctcatctcgaggct 

85 RLLLRLVVQFVLFLCHQLRLLHSFHLEA 

25330 cctcttgttcgtccaactcgtctgctaatacaggcaatgctccagccgagattccgtcaagctgagcaagttcttccaaaacgc 

113 pLVRLIRLLIQAMLQLRFRQAEQVLPKC 

25246 gttcccattccttgtccgccttttccttcttactga 25211 * 

141 VPIPCPPFPSY* 

dplORF045 a c agaagacgaagtcgatgacaaagaaaaagaacaaactgaaca accaaccgaagaaggagccgacccagacg 

1 MKRVKKT KLMT KKKNK LNNQ P KKEST QT 

25424 ttcaaggttaatcgtgaccaetgcgagcataagcccgaccttacatctaaacagattatttcgaaacatatcgaaaagggcgta 

29 FKVN CDHCEHKFDLTSKQ I ISKHIEKGV 

25508 gagcggagattcttcgaatgtcctaagtgccattatcggctcaccacttatgcaggaaacaaggaaattgaaaaccttactcga 

57 EWRFFECPKCHYRFTTYVGNKEIENLIR

25592 tttagaaatacctgtcgagccaaaatgaagcaggaacttcaaaaaggagctgccgctaatcaaaacacttaccattcatatcga 

85 frn T CRAKMKQELQKGAAANQNTYHSYR 

25676 attcaggatgagcaagctgggcataaaatctcagggcttatggcgaagotaaagaaggagataaacattgaaaaacgagaaaaa 

113 IQDE QAGHKISGLMAKLKKEINIEKREK 

25760 gaatgggtatctatatag 25777 

141 E W V S I *

dplORF046 ^ tgtggct^acgacacagcagtcttgacgacgattattacagcgtgcagcggagtgcttactgtcctactaaataag 

1 MPMWLNDTAVLTTI I TACS G VLTVLL N K 

42858 teacccgaatggaaatcgaataaagccaagagcgttctagaggatatctctacaactcttagcactcctaaacagcaggtcgac 

29 LF EWKSH KAKSV.LEDISTTLSTLKQ Q V D 

42942 gggattgaccaaacgacagtagcaatcaatcaccaaaatgacgtcattcaagacggaactagaaaaattcaacgttaccgtcct 

57 0 1 DQTTVAIMHQMDVIQDGTRKIQRYR L 

43026 catcacgacttaaaaagggaagtgataacaggctatacaactctcgaccattttagagagctctctactttattcgaaagtcat 

85 YHDLKREVITGYTTLDHFRELSILFES Y 

43110 aagaaccttggcggaaatggtgaagttgaagccttgtatgaaaaatacaagaaattaccaattagggaggaagacttagacgaa 

113 KM LGGNGEVEALYEKY.KKLPI R EEDLDE 

43194 actatctaa 43202 

141 T I * 

dplORF047

47542 atgaaacttgaagatgaaaaacagttcatcgctgcaattgaagaagccggtgaattaaatgctaccaaaggcgacatggag^

1 MKFEDEKQFIAAIEEAGELNATKGDMEK 

47626 caagtcaaaagtcttcgtgatgctctaaaagagtacatgaaagaaaatgacattgaatctgctcaaggtaagcacttttctgct 

29 Q VKSLRDALKEYMKENDIESAQGKHFSA

47710 accctctacacgacagagcgcccaactatggacgaagaacgcttgaaagaaattatcgaaaaattagttgacgaagccgagacg 

57 TFYTTERSTMDEERLKEI IEKLVDEA ET

47794 gaagaaatgtgtgaaaaactttcagggotcatcgaacacaagcctgtcaecaatacgaaacttctcgaggacatgattcatcac 

85 EEM C EKLSGLIEYKP VINTKLLEDMIYH 

47878 ggcgagattgaocaagaagcaattcttccagcagttgtcatttctgttacagaaggcattcgctttggaaaggctaaaatttag 

113 G E I D Q E A I L P A V V I S V T E G I R F G K A K I *
dplORF048

16709 atggaaacaacactttatttcggttatcttacagcagattggaaagacggtcacaagaactacactttccactatgaaagcact

1 M E TTLY FGY LTADW KD GH KNYT FHY ES

16625 cctgtaaaagaaactgagaaacaatataaggtcactggaatcaatcctaacctgtacctagacctaggctcagctattagaaag 

29 PVKETEKQYKVTGI HPNLYIiDLGSvIR r

16541 agcgaacttgacattgcagtattcaaagcatgccccgtcgctgaaaccggagtcacacttactcgcgacacggaagttgatgcc 

57 S ELDIAVFKACPVAETGVTLTRDMEV DA 

16457 agaattgaaaccatcaagaaattaactacaagaatcgaacgccttaacgaaagaattaaagcaagaaatgaacaaggtaaacaa 

85 R I E I I KKLTTR I ERLNER I KARNEQGKQ

16373 gaaagccgccacctagtatctgcgctagaagattgcgctcgtcaaattgctggaatttatcaataa 16308 

113 E S R H L V S A L E D C A R Q I A G I Y Q *
dplORF049 



44018 atgtttcaaccattcctcagcgagcacgtagccttggtcgtcaaagtagaaccaagacttgttttctccgatatactcgaactc 

1 M FOPFLSEHVALVVKVEPRLVFFDILEL 

43934 atcttttggataagttccgttcgctcgagcgtaccagaaaccagtagcatctttctgccagccaagtctcttctcagccggttg 

29 T FW IS SVCSSVPETSSIFLPAKFLLSRL

43850 agcatttgcgttagccaagcgacagacgtagtagtaaggctgacctgcatagtaccaacgctcatcgtggtcgttgacggaaat 

57 sicVSQAIDVVVRLTCIVPTLIVVVu^ 

43766 tccgtcgtaggcgtagttgcagtgaatgatgttatcactgtcaatgaacatccctgtacgacctccagcgcctgcgctagcacc 

85 sv vGVVAVNDVITVMBHPCMTSSACAa

43682 tttgcgtccccagacgaagatgtcgcctcgtttagcatcccacggagcattttcaccaattag 43620 

113 FASPDEDVA S' F SIPRSIFTM* 

dplORF050

15081 atgaacaatcagcgaaagcaaatgaacaaacgaatcgtcgaacttcgcgaagactatcaacgtgcaagaggtcgaataaactcc

l MHNQRKQMNKRIVELREDYQR ARGRIN' 
15165 cttcttgctgtaaaggaccacggcgaagaactcgaaaaccttgaagcctttgtgggatacattgacaatctagtcgaatgtttt
29 LLAVKDHGEELENLEAFVGYIDNLVECF 
15249 cctgaaagccaacgaaatgtcttgaggctatgtgtattagatgaccttccagtcactaatgcggccgctgaaattggataccac
57 PESQRNVLRLCVLDDLPVTMAAAEIGYH 
15333 tatacatgggttcaccaacttcgagacaaagcagttgaaacacttgaagaaattttagatggggataacattattcgctctaaa 
85 YTWVHQLRDKAVETLE E I LDGDNZ IRSK 

15417 cacggaatcgaaattaaggagaaacttgatgaattatatggtaaaagtcattctagttag 15476 
113 HGIEIKEKLDELYGKSHSS* 
dplORF051 

29765 atgagttatgacgtgaattatgttaagaatcaagctcgtagagccattgaaaccgctcctactaaaatcaaggtacttcgaaac 

1 MS-YDVNYVKNQVRRAI ETAPTKI KV LRN 

29849 tcttgggtcagtgatggatatggaggaaagaaaaaggataaagcgaatgaagtcgtagcagacgaccttgtttgtttagttgat
29 SWVSDGYGGKKKDKANEVVADDLVCLVD 
29933 aattcaactgttcctgaccttttagccaattctactgacgcgggaaaaatttttgcccaaaatggagtgaaaattttcattcta 
57 NSTVPDLLANSTDAGKI FAQNGVKIFIL 
30017 tatgatgaaggcaaaatcattcaacgagccgatactatcgaaattaaaaactcaggaagacggtacagggtagtagaaacccac 
85 YDEGKI IQRAD TIEIKNSGRRYRVVETH 
30101 aatcttctcgagcaagacattttgatagaacttaaattggaggtgaacgactaa 30154 

113 NLLEQDILIELKLEVND * 

dplORF052 

30516 atgactaaacgaacgacaatgatggacagattgaaggaaattcttcctacatttcagctctcgcctgctcctatgcttccagga 

1 MTKRTTMMD R LK E I L PT FQLS PAPMLPG 

30600 gttgaatttgacgagcaagatacagataggccggatgactacattgttcttcgatatagtcatagaatgcccagcgcaacaaat 

29 VE FOEQDTDR PDDYI VLRYS HRMPSATN 
30684 agcctaggaagttttgcttattggaaagttcaaatctacgtccattcaaactcaattattggtatcgacgaatatagcagaaag 

57 S LGS.F AYWKVQ I YVHSNS I IG IDEYS RK 

30768 gttcgaaacattatcaaggacatgggctacgaagtaacctatgcagaaactggtgactacttcgacacaatgctttctagatac 

85 VRNI I KDMGYEVTYAETGDY. FDTMLSRY 
30852 cgactagaaatcgaatatagaattccacaaggaggaaactaa 30893 

113 RLEIEYRI PQGGN* 
dplORF053 

50300 atgctaacattcgaaagaatagtatctatacgagcaccaacttgcatttcactcatttccccgctatatagaaggacatcatgc 

1 MLTFERIVS I RAPTCISLISPLYRRTSC 

50216 ccgttcttccaagcagttgcaagcattttatcaatagtccacgacttaccttgtccaggtcgagccattatgacaatcaaatcc 

29 PFFQAVAS I LS IVH DLPCPGRAIMTI KS 

50132 tcaccaggaagtaagcctccaagcacgtcgtccaatagttcaaaccctgtcgatattccaagtctttcaccgtcatggtttcta 

57 SPGSKPPSTSSNSSNPVDIPSLSPSWFL 

50048 atagtattcgcccagtctagtcgaagtttagcatttcgagcaatgtctagtccgcctacgaatttagagcgattgaaaagttct 

85 I VF AQSSRS LA F RAMS 'S PPTNLERLKSS 

49964 tctagttttggaattatattcgcaatcgcaatgttactatctacttga 49917 

113 SSFGI IFAIAMLLST* 
dplORF054 

14423 atgtgtgaaaattgtcaaaacgaaacattcaatactagaattttcaatgaagatgaaagtggctatgtcgacgcctcattcact

1 MCENCQN ETFNTRIFNED ESGYVDASFT 

14507 tacaaggagattcgcgacaccgcagcagctattagcaatcgagcggtagaaaagaaagaccgtgacagccttttagtcgctaca

29 YKE IRDTAAAI SNRAVEKKDRDSLLVAT 

14591 gttatggctcttcccgtttctcacgcagaagatttaggcaagagactttgtattgcaaattctcgattggaagcatttcgtgaa 

57 VMALPVSHAEDLGKRLC I A N S RLE A F, RE 

14675 gctgctcaagaggctctcgagaatgaaaaggctgaagatttaaaggacgttatcttaggtcttatcgacgttgacaaaaaaatt 

85 AVQ EAL ENE KA E 0 L KDV I LG L I D V D K K I 

14759 ggcaaccttgcattgcaattagttgaatcaggagcattataa 14800

113 GNLALQLVESGAL* 
dplORF055 

27627 atgcctaatgtgcgagttaagaaaactgattttaatcaaaccactcgaagcattgtcgcaattcctgaccactacgttgctttg 

1 MPNVRVKKTDFNQTTRS IVAI PDHYVAL 

27711 gctgctcaaattccagctaccgcagcaactcaagtagggaacaagaaatacattcttgccggaacttgcgtgaaaaatgctact 

29 AAQ I PATAATQVGNKKY ILAGTCVKNAT 

27795 acatttgaaggacgcaaaactggactcgaagtagtatctaccggtgaacaattcgacggagttatcttcgctgaccaagaagtg 

57 TFEGRKTGLEVVST GEQFD GVI FA DQEV 

27879 tttgaaggtgaagaaaaagtaaccgtgacagtattagttcacggattcgtcaaatatgcagcccttcgaaaagttggcgatgct 

85 FEGEBKVTVTVLVHGFVKYAALRKVGDA 

27963 gtgcctgaatctaaaaacgcaatgattcttgtcgttaaatag 28004 

113 VPESKNAMI LVVK* 
dplORF056 

19151 atggaaaataaatggaaagttatccatcttcaaaactcatgtattaaacaagtagacgatgaaaaaaggaggctcctgttcgaa 

1 MENKWKVIHFQNSCI KQVDDE KRRLLFE 

19067 gttccaggaactccttatcgtctacaagcttgggtgaaaatgagcttagttaaaattgaaacacgcgcaggaaatggctattat 

29 VPGTPYRL QVWVK MS LVKI ETRAGNGYY 

18983 aaaaggctagtatgccaagacgattttgtattttatggtaaggagtcaatagatggttacttaattgacgccaccataactggc

57 KRLVCQDDFVFYGKES IDGYLIDATITG

18899 aaacctttggcggaatattgcgagcctatgaacaggcatattctcgaaactattgcatcgcgagaagcagctgaactgaacaga 

85 KSLAE .YCEPMNRHILETIASREAAELNR 

18815 gctaaaaagcaagaccaacagaaatggagatactag 18780 

113 AKKQDQQKWRY* 
dplORF057 
9859 atgcaaaaatctctatttggacctaagctagtgcctgctagttcaaggcgcaagaaaagaacggttccaaaacctaaacctaaa

1 MQKSLFG PKLVPASSRRK.KRTVPKPKPK 

9943 atcgatgagcaagtggtcgagcttatgaaccgcagagagcgtcaagtgcttgttcatagttgcacctatcattattttaatgac

29 IDEQVVE LM NRRERQVLVHSC I YYYFND 

10027 tcaattatagcagacgggcagtatgacaaatggagccacgaactatactctcttatagtttcgcaccctgatgagtttcgacag 

57 SIIADGQYDK. WSHELYSLIVSHPDEPRQ 

10111 actgttctctataacgagtttaaacagtttgacggaaatactggaatgggtcttccatacgactgtcagtttgctgtaagggtc 

85 TVLYNEFKQF DGNTGMGLPYDCQFAVRV

10195 gcagaaaggcttttaagaaaatga 10218 

113 AERLLRK* 
dplORF058 

15633 atgacatcacgcgcatacaaaccaatccccacgcgcagagctagtgctaaacaagagaaggcagttgctaagcagttgggagga 

1 MTSRAYKPI PTRRASAKQEKAVAKQLGG 

15717 aaagtacagcctaattcaggagccactgactactacaaaggtgacgtcgtaacagactcaatgcttatagaatgcaagacagtt 

29 KVQPNSGATDYYKGDVVTDSMLIECKTV 

15801 atgaagccacaaagttcagtcagcttgaaaaaggaatggttcctaaaaaatgaacaggaaaggttcgctcaaaaactcgactat 

57 MKPQS S V S L KKEWFLKNEQE R FAQK LDY 

15885 tctgctatcgctttcgactttggtgacggaggcgaacagtatatagcaatgtctataagtcagttcaagcgaatattagaggat 

85 SA I A FD F G DGG EQY I AM S I S Q F KR I L ED 

15969 agaaatgataaccttatttaa 15989 

113 R N D N L I * 
dplORF059 

30154 atgtctcagcctgaattagtatggaagcctgaagaatttgttagtaactgtgaacggtatcgaaacaagtttcaagtcgctgtc 

1 MSQPELVWK PEEFVSNCERYRNKFQVAV 

30238 ataacagtctgcgaagtcgctgctactaagatggaagaatacgcaaagacgcatgctatttggacagaccgtacagggaatgct 

29 I TVC EVAAT KM E EYAKT HA I W T DRTGNA 

30322 cgacagaaactcaaaggagaagctgcttgggtaagcgcagaccaaatcatgatagctgtatcacatcacatggactacgggttt 

57 RQKLKGEA AWVSADQI M IAVSHHMDYGF 

30406 tggctagaactagctcatggtcgaaaatacaaaattctcgaacaggctgtagaagacaatgtcgaagaactttttagagcgttg 

85 WLELAHG RK YKI L EQAVE DNV E ELFRAL 

30490 agaaggttattagactag 30507

113 R R L L D * 
dplORF060 

38070 gtgatagctgtatctgctatccctactccgctctttccaggtacaccgtcgactccatcacgcccaggagctcccggtaaacct 

1 VI AVSA I PT P L FPGT P S T P S R PGAPGKP 

37986 gcgtcacctttaggaccttctagtcgaatccatgtaaagtcgtcaggaactaattcgctcggtttcttattagtattaaggaca 

29 AS PLG P S S.R I HVKS SGTNS LG F L L VLRT 

37902 ccaatgtatttcccagattctgcattaaaattagtccctaaaatgtcatctgcgtatctaataacaacttgggactcatttaca
57 PMYF PD SALKLV PKMS SAYLI TTWDS FT. 
37818 gtttcccctgaaaggactccttcgccgtcctcatttagcaagtccatcaagtcttttcgagggtcttggaaaatgatagtagag 
85 V SPERTPSPSSFSKSIKSFRGSWKMIVE 
37734 tttgaaaggtcgtcgtag 37717 

113 F E R S S * 
dplORF061 

19475 atggcgagaatgcaaagattatgcccgatgaaattttggaaggcggtaactaaaatgaaattcgaagtttattctgcgcgacta 

1 MARMQR LC P MKFWKAVT KMKF EVYSARL 

19391 tttgacgaagaggcgacatatgataggtatcgtgaagcactagagaaagttggaaatgtcgcttacttttgtgaaattgatact 

29 FDEEATYDRYREAL EKVGNVAYFCEIDT 

19307 ggcaaccttgtaatcgaactcgagctagacagcctagatgacctaatcgcgctttcaaatgtagtgggaactggactaaaatta 

57 G NLVIEL E LDSLDDLIALSNVVGTGLKL 

19223 tcacggccttatagagaagataagccttttcaattatggattgttgacgggtacatggaataa 19161 

85 S R PYR E D K P FQ LW I VDG YME * 
dplORF062 

45284 gtgagaagcttcaatcaattccattgcggtgtcaatatcttcttccttgacgagtttaaaaattccgtcaatcgcccattcgta
1 VRSFNQFHCGVNIFFLDEFKN SVNRPFV 
45200 agatgcaggagcaatagatgcaagaagtttcttttggtcttctgtcaacccttttgcgcgaactccaatagaaacaccttttcg 
29 RCRSNRCKKF LLVFCQPFCAN SNRNTFS 

45116 agtttcttcgatagtaacgaagttcttcttagagcgataggcgatgtaagactgagtgacgattcgagtagacgcaggaaaggc

57 SFFDSNEVLLRAIGDVRLSDDSSRRR K G 

45032 ttcaacaattcgactttcaagagccttagtaatcgccatcacgctttcttttttaggagcaggttttcgaacagtagatttctc 

85 FNNSTFKSLSNRHHAFFFRSRFSNSRFL 

44948 actaactga 44940 

113 T N * 
dplORF063 

47200 atgaaattcactgaaggaaaaaattggtataaagttggagagatatgtcaaatgttgaaccgctctctatctacgattaatgtt 

1 MKFTEGKNWYKVGEICQMLNRSLSTINV 

47284 tggtatgaagcaaaagacttcgctgaagaaaataacattcacttcccgtttgttcttcctgaacctagaacagaccttgaccat 

29 WYEAKD FAEENNIHFPFVLPE PRTDLDH 

47368 cgtggttctcgattctgggatgacgaaggcgtgaacaaactcaaacgatttagggacaacctaatgcgcggtgacttggcattc 

57 RGS RFW D DEGVN KLK R F RDNLMRGDLAF 

47452 cacactcgaactcttgtagggaaaactgaaagggaagcaattcaagaagatgctaaagcatttaaacgtgaacatggattggag 

85 YTRT L V GKTEREAI QEDAKAFKREHGLE 

47536 aattaa 47541 

113 N * 
dplORF064 
29108 atggccacattgaaagctcttagcaccttaaccgttcccggagcagtagtgcactcagggtcggtattttcttgccctgaagcg

1 M ATL KALSTLIVSGAVVHSGSVFSCPE A 

29192 cctgccccgtccttaactgaacgcaacttcgcgttcgagateaaggcggccgaagatggagaaacggtagaaactgttcctcaa 

29 laS SLIERM FAPEIKAAEDGETVETVP Q.

29276 acaactgaatcagttgaagaaattgacgaagttgaacaaatgcgcgaagagtatgcggccaaaaccgcccccgagctcgttgaa 

57 TIESVEEIDEVEQMREEYAAKTVPELVE

29360 ttagcaagagccaatggaaccgacatftctecaatttctcgaaaaagcgaatacatcgacgctttaattaagtacgaactagga 

85 LARANGIDISSISRKSEYIDALIKYELG

29444 gagcaa 29449 

113 E * 

dplORF065

51497 atgcagttcgtcataacctacatcaaacatctcgatgagcccgtccgtcaatttccgttcacacatataaggatgaacaaaccg

1 M Q F V I TV I KH LDELVRQ F P P I H I RMNK P

51413 gcatctaccaagttcctcttcaggaatgattttacgctcgactctttcagttctcccatttcttcgaaacgcttcagggccgac

29 vFIKFLFRNDFMLDFFSSPISSKRFRAU

51329 gccttgcctaactactccgctagatgttccaaaattccttttcagccactggtttccatagaaccctccatcgtttcgacctaa 51246

57 ALPNYFARC SKI PFQPLVS I E P S I V S T



5K5"" gcgaccaactgcgtcaggtggaagcaataccactttaccgtcgtcaatcaagttgaactgacgaatgttaccaacgtcaggaag 
^ v tNCVRWKQYHFTVVNQVELTNVTNVRK 

28814 cttgtcagcgtcagcgaactgagcaattttcttagagtagacagcgatttgaagacctgtttcttcagcgatgaatttctcagc 

fvsvselsnflrvdsdlktcffsdefls 

28730 gccactcgcaagaagcaagaagtcttcccaagaacctcgaacaccaattgcaagagctttctcgatagagtcactcttagtcac

c, vtckkqevfp rtlmtncksfldrvtlsh 

28646 ctggtcataagtgttccggttcaagaccatccgagtagggcgaacacccgtacgaccttcgatgtcatccattgctgccaa 

ll S66 L V I SVSVQDHS S RANTCT I F D V I H C C • 

dplORF067

45061 gtgacgattcgagtagacgcaggaaaggcttcaacaattcgactttcaagagccttagtaatcgccatcacgctttctttttta
1 VTIRVDAGKASTIRLSRA LVIAITLS r u 

44977 ^cagget-ga^ 

44893 ^^^^ 

44809 gcagcttcctcttttttaggttcagtttcatcttccattgtgtaccaacgttcgagagttgaagctgaaaggtga 44735 

85 aassflgsvsssivyqrsrveaer 

S'sf 068 atggcagctcaaacggacattgaattagtcaaaatcaatatcgataacgataattctccgtcaccaatgactgaccaaagtatc 
^ maAQ TDIBLVKIMIDNDNSP SPMTUWoa 

29703 cttgaaactgatgaaaagtcgaacgctggttcgacaatcttaatgaaaagggctgatgggacatga 29768 

85 L ixDEKSMAGSTILMKRADGT* 

?o\ll F ° 69 atgaaactttatcacgccactgattttgataatcttggtaaaattctagctgaaggaftgaagccttcagctggagttatttac 

i M KLYHATDFDNLG KILAEGLKPSAGVl l 

,0327 c^gcagaaa^gaa^ 

20243 atcgaaaaatgtactgaaagtttcgaccataatgaaaagatgttttgtagcctatttcatttcgacacttgtcgcgcttggact 

S7 jIkctESPDHMEKMFCSLFHFDTCRAWI 

20159 tatgacaagacaattgaagtagacgacattgacttttcgaaagctcgaaaatatgatagaaagtga 20094

85 ydKTIBVDDIDFSKARKYDRK* 



-0S7 cctatacaagagg^ 

"l41 cggagccgtgtaggagtttcaaaatatggtacaaacctcgaccagaatgatgtcgacgatttcctacagcacgccaaagaagaa 

57 R S R VG VSKYGTNLDQHDVDDFLQHAKEE 

16225 gcgctcgaccttgctaactacctaaccaagctacaaagtcaacaaaagcaaaataaatag 162 84 

85 ALDFAMYLTKLQSQQKQNK* 

^"^cgaaacagg^ 

38988 gctcgcaatatactcacctcgctttctctaatagtccaaacggtgagggatttagtcacactgacagcggacgagcacacgtcg
29 VR NILTSLSLIVQTVRDLVILTADEHTS
39072 gtcagtatcaagatttcaatcccgtccattcaaaagaccctgcagcctacacacggacgaaatggaaggggaatgacggagctc
57 VS IKISIPSIQKTLQPIHGRNGRGMTELi

39156 aagggatacccgggaagccaggcgcagacggtaagactaattattcccatatag 39209
85 KGYPGS QAQ TVRLI I S I *

dplORF072

51045 acgttccttcgtcctcaagttgtctcgaaagtttttcaatcatctgttcaggagtcgcttcaatctgaagaccatttactccca

1 MFLRLQVVSKVFQLFVQESLQFEDHLLS

50961 tcaaaatgcttcaactccttcccttgtaaccttacttcgaagacgagcagtcgacctagaggcttttgctttcaatggagagct 

29 skcFNSFPCNLTSKTSSRPRGFCFQWRA 

50877 ttcgcctttttcagttccttcttcgccttcctctttgaatcctataagagtataggttccagtttcaacgtcccacatatattc 

57 FAF FS S F FA FLFESYKS IGSSFNVPH IF 

50793 gatgatttttcggtcttcgccatatcggtttttaacgacagatag 50749

85 DDFSVFAISVFNDR*

dplORF073

14262 gtgaacgcttgccggaagaatacgacgaagaaacttgggaacctatcactgaagcagaatacatcaagcgaacagaaaaaccta

1 v mACRK NTTKKLGNLSLKQNTSS EQKNL 

14346 aagcagttgcaaaacctactcgaaaaactccagcgccttctcgtcgccctcgcccttaaaagaaaggttgaaataaaatgtgtg 

29 K QLQNLLEKLQRLLVALALKRKVEIKCV 

14430 aaaattgtcaaaacgaaacattcaatactagaattttcaatgaagatgaaagtggctatgtcgacgcctcattcacttacaagg 

57 K I V KT KH S I L EFSMKMKVAMST PHS.LTR 

14514 agattcgcgacaccgcagcagctattagcaatcgagcggtag 14555 

85 RFATPQQLLAIER* 



dplORF074 

32298 gtgacgaaaagaaaaatccaggattgcaaatgcttatggagtgactactttcagtcgctcctctttttgtatatagaaaggaaa 
1 VTKRKIQDCKCLWSDYFQSLLFLYIERK 
32382 ttacatggattttgggtcaattgcagcaaaaatgactttggatatctcaaacttcacaagtcaatcaaatcttgctcaaagtca 
29 LHG FWVNCS KNDFGYLKLHKS I KSCSKS




32466 agcgcaacggctcgcactagagtcttcgaagtcctttcaaattggttctgctttaacaggattagggaaaggacttacgactgc 

57 S A T A R TRVFEVLSNWFCFNRIRERT YDC 

32550 ggttacccttcctcttatgggatctgcagccgcctctattaa 32591 

85 GYPSSYG ICSRLY* 



dplORF075

22447 atggcaaagttttgtccgttgaattccgtcatggcccaaagggaaaatgaaagagccatcgatactgtttttcctgaacgaatg

1 M A K F C P LNSVMAQRE NERA I DT VF PE R M
22363 gaaccgtctgctatgacgatatcgaaagttcgaaaaggtgagccctttgtccaccatgttaggagctggagttgtttcttacta
29 E PSA M T I S K VRKGEPFVHHVRSWSCF LL

22279 aaagggacgaagttgaacttaggtagtttatttctcaggcttattgtcattatcagtcactcctttaatgtaggaacctgttgc 
57 K G T K L N lgslflrlivi ishsf.nvgtcc 

22195 gtcactaaattcttgccaaacggctt'gagctgccttatctag 22154 
85 vTK FLPNGLSCFI* 

dplORF076

5728 gtgagagcattttcttcactcacgtcttcgagcaagtggtcgaatgtagggtactcttcatcttctgtaacaatatcaatattg
1 V R AFSSLTSSSKWSNVGYSSSSVTISIL
5644 tactcaccattcccaataacttttagcgaagattcttcaggaactaatgtgacggttgcggccgtggtcttttctacaagtttt
29 yspfpitFSEDSSGTNVTVAAVVFSTSF
5560 ccaaactgccctgctttcacaatcacgtcaatttcaacatcgctgtcgataatgcatcgaaggaagtttgagccatcatacgct 
57 PNC S A FT I T S .ISTS L S I MHRRKFE PSYA 

5476 gtaaacatgacgcattcgccgtcaccaaaaatatgccaatag 5435

85 VNMTHSPSPKICQ* 

dplORF077

14800 atggaacgaataaagacgctatttcacgtgatttatgctaacggcactcatttagaagtagcagctttgttcgataccgttgat
1 m er iktlf hviyangthlevaalfdtvd

14884 gattatgatgacgttatagaggacatccaggggtatattgatacccctgacctttataatcaaaggagcattagaatggcgcct 
29 dyddvi ediqgyidtpdlynqrsirm ap 

14968 tacaatcctgacatcaatggtgacgctattgctactgacattttactacgactagatgatattatctacgtcgacgcaa^ 

57 ynpding daiatdillrlddiiyvdatc

15052 gaaactattaaatacgaggagcctattgcatga 15084 

85 ETIKYEEPIA* 

dplORF078 ^ ^ agtaaa ggaaacagtaaaattt g acggacgtcttgtaactatcttcgactacgacgatttagagt g ggaaggatat 

1 maTVKETVKFDGRLVTIFDYDDLEWEGY 

17423 gcacctaatgaaggattcgaagatgttgaggacatggaagtccttagcattcgagttcgaaacgaaggtgaggacgacgagtgg 

29 APNEGFEDVEDMEVLSIRVRNEGEDDEW

17339 gttgaagttatcgcctgctatgaaaacgatgacgaggacgaagatttggaagggttataa 17280 

57 VEVIACYENDDEDEDLEGL* 



dplORF079

35288 atggaactgataccattgataaatcctcgaacaag^^

1 MELIPLINPRTRLTPALTICPANPVTLE

35204 acaattgaagttcccatgctgccaattttagagacagctgaaccaatcattgacccaataccactaatgaagtttcgaatcagg

29 TIEVPMLPILETAEPIIDPIPLMKFRIR

35120 ttcgcacctcctgaaaccatctgtcccacaaagctagcaatcttgctaactaatgatgaaagcatgtttccagctgtcgataaa

57 FAPPETICPTKLAILLTNDESMFPAVDK

35036 agtgagccgagaagtgaagcaataccttga 35007

85 SEPRSEAIP*
dplORF080 acctcacaaaatcgcgcca ^

1 MLNLT KS RQIVAEFTIGQGAEKKLVK T T 

42574 attgtgaacattgatgcaaacgcagtatcaaccgtctctgaaactcttcatgacccagacttgtatgctgcgaaccgtcgagaa 

29 IVNIDANAVSTVSETLHDPDLYAANRRE

42658 cttcgagctgacgagcaaaaacttcgcgaaactcgttacgcaatcgaagatgaaattctagctgaacagtcaaagactgaaaca 

57 LR ADE QKLRETRYAI EDEI LAEQS KTET 

42742 gctctaacagctgaataa 42759 
85 A L T A E *
dplORF081

55466 atgttcaggaacagtatcgtccacctgttggtctgcgtcaaagttaaaggggtcgaaatcttcgttcttgctagcgtcgataca

1 M FRNS I VHL LVCV KVKGVE I F VLASVD I 

55382 ctcgaactcgtattcaggaagactcatatcaggaagccttcttcttcgaccggtagctgtttgaacatatcccaagtcctgcgc 

29 LELVFRKTHIRKPSSSTGSCLNISQVLR 

55298 ctgctgttgaacgaatatgatatagtctgccactttagggaactcggtgaagaaatcttcaataaccttattcgcttctttgac 

57 LLLNEYDIVCHFRELGE EIFNNLIRFFD 

55214 agatacattcatctgctcagcgattga 55188 

85 RYIHLLSD* 
dplORF082 

44728 gtgaacttcacctttcagcttcaactctcgaacgttggtacacaatggaagatgaaactgaacctaaaaaagaagaagctgcta

1 VNFTFQLQLSNVG TQWKMKLN LK KKKLL 

44812 aacctgctaaaaaggctgctcctgcagttgctcgacctgctcgaaaaggtagagtcgttcccaaacctaaaaaagaagtccttg

29 N-LLKRLLLQLLDLLEKVESFPNLKKKSL 

44896 aggaagaaattcctgaagttaaggaacagccggaagaagttggttcagttagtgagaaatctactgttcgaaaacctgctccta

57 RKKFLKLRNSRKKL VQLVRNL LFENLLL 

44980 aaaaagaaagcgtga 44994

85 K K K A * 
dplORF083 

35974 atgcctccagggtttttaaatcctgagtccttaaatcctgcgaaagtgagtcctacatattctagcacggttgcacctttgtcg

1 MPSGFLNPESLMPAKVS-PTYSSTVAPLS 

35890 acaaggtcaattccgtcgaccaatagcgtctgtctgctagccatctatttctcctttacggtgttacaatgttaccaaaccctg

29 TR S I P STNSVCL L AI YFS FTVLQCYQTL 

35806 atagagtttctttacttctattatacaatcctctcgacagtttgtcaacgtcgtcattgtttcgaactacgattgttccaatgt

57 I EFLYFYYTILSTV CQRRHC FELRLFQC 

35722 tga 35720

85 * 
dplORF084 

15445 atgaattatatggtaaaagtcattctagttagtgtctttgtactgtcagccttttgcatgacttgctcaatggtttatttggtt 

1 MNYMVKVI LVSVF VLSAFCMTCSMVYLV 

15529 acaggcaagcaagaggaccaccgtagtaccgtcgcccttgtatctggcgctctcgtaagctctgcggcgttctattcgacactc 

29 TGKQEDHRSTVALVFGALVSSAAFYSTL 

15613 tttatcctcgcctatctgccatga 15636 

57 FILAYLP* 
dplORF085

10847 gtgatgactataatcaaggactttttcgagccttgtgatactgtcacgcattcctccatttgcaagtttcccaataaacgaaag

1 VMTI I KDFFEPCDTVTHSSICKFPNKRK 

10763 ggcgtcacgctcataactataaccagctccttcttcattttcactttcgataataaattgaagttgattaacgatgtcgtcatt 

29 GVTLITITSSFFIFTFDNKLKLINDVVI 

10679 atcaattcgagtaaagtcaaaccgttgaactcgactgagaatagtgtcaggaatcttttgagggtcagtagtacatag 10602 

57 I N S S K V K P LNS TEN S VRN L LRVS S T * 
dplORF086 

52760 atatgggaaaagtatcaattcaaaaatcaggaacatttagctcagggtctaataacgagtttttcacactcgctgaccacggtg 

1 I WE KYQFKNQEHLAQGLITSFSHSLTTV 

52844 acagcgcaattgtcactctattgcatgatgacccggaaggcgaagacatggattatttcgtag 52906 

29 TAQLSLYCMMTRKAKTWI IS * 
dplORF087 

30036 atgattttgccttcatcatatagaatgaaaattttcactccattttgggcaaaaatttttcccgcgtcagtagaattggctaaa 

1 MI LPS SY RMKIFTPFWAKIFPASVELAK 

29952 aggtcaggaacagttgaattatcaactaaacaaacaaggtcgtctgctacgacttcactcgctttatcctttttctttcctcca
29 RSGTVELSTKQTRSSA TTSFALS FFFPP 
29868 tatccatcactgacccaagagtttcgaagtaccttgattttagtaggagcggtttcaatggctctacgaacttga 29794 

57 Y P S LTQ E F RSTLI LVGAVSMALRT * 
dplORF088 

5040 atgaaaaaagttcaaacttatcaagaatatctaaaactagttgagttcaaacgtcaactttctttaaatcttcgagaaggaaaa 

1 MKKVQTYQEYLKLVE FKRQLSLNLREGK 

5124 ataggagtcgatgaagcggttattcaattattcaccttctatagcttcaacaatatcgaggaacctcctttcattgtactcaaa 

29 I G VD E AV IQLFT FYS FNN'I EE P P F I VLK 

5208 atgcaagaggctgccgtgaacgggacttatgaagcaaaactcaatatgcttaaaagatttaaaattatttag 5279 

57 MQ EAA VNGTYEAKLNMLKRFKI I * 
dplORF089

12495 atgtcaatcatgtcgctaccaatagtcgagtatttagacacaaaatgccttttcaactgcgcgtcagtcattttctcaaactca

1 MSIMSLSIVEYLDTKCLFNCASVIFSNS 

12411 acacaattatcaggaaaggcctttagcaacttgcttcgcttgtcaattttagtaaccatcaaaacaagtgtcccacatctaaca 

29 TQLSG KAFSNLLRLS I LVTI KTSVPYL T 

12327 tccggaagccttttccacctcgactcattagacagaaactccttatcatctcgaacagcgaatattcgatga 12256 

57 SG S LFHLDSLDRNS LS SRTANIR * 
dplORF090

27037 atgctaaaattttcattgacggcgacggtcaacattttgtacctcacgcacgtttcgatgaagttgttcaacagcgcgatgcag 

1 MLKFSLTATVNILYLTHVSMKLFNSAMQ 

27121 ctaacggctcaattaattcttataaagaacaagtcgcgacgctttctaaacaggtcaaagataacggtgatgcgcagaccacta 

29 L TAQLI LIKNKSRRFLNRSKITV-MRRPL 

27205 tccaaaaccttcaagagcaactcgacaagcagtctcaacttgcaaaaggcgctgtga 27261 

57 SKT. FKSNSTSSLNLQKAL* 
dplORF091



43189 atgaaactatctaacgaacaatatgacgtagcaaagaacgtggtaaccgtagtcgttccagcagcgattgcactaattacaggt 

1 MKLSNEQYDVAKNVVTVVVPAAIALITG 

43273 cttggagcgtcgcatcaatctgacactactgctatcacaggaaccattgcacttcttgcaacctttgcaggtactgttctagga 

29 LGALYQFDTTAITGTIALLA T FAGTVLG 

43357 gtttctagccgaaactaccaaaaggaacaagaagctcaaaacaatgaggtggaataa 43413 

57 VSSRNY. QKEQEAQNNEVE* 
dplORF092 

46989 atgaaaactatctccatattaaggaaagacactaaaaggaagccggacaggaacggaagaaaaactgcactcgaactagctcaa 

1 MKT I S I LRKDTKRKPDRNGRK TALELAQ 

47073 gagattgatatgtcacctagtgagttagcagagctccttcaaattcctgaaaggacggcaaccagaattttaaaactcgacaaa 

29 EIDM SPSELAEL LQI'PERTATRILKLDK 

47157 ctgctcaacaaagagcaatgctcaataatagaaaggtatataaatgaaattcactga 47213 

57 LLNKEQCSI IERYI NEIH* 
dplORF093 

45756 atgcaacatacgattaaacaatgtttgaaacttgccttcctgctaactgcaatatcaattgcctgtttagttttccctaaacct 

1 MQH TIKQCLKLAFLLTAIS IACLVFPKP 

45672 tgctcatcgcctaaaaggaaacatggatgctcttgtgcgtattcgaaacattcaacctggtgcgcgaatggagtagtcttgaac 

29 C SS PKRKHGCSCAYSKHSTWCANGVVLN 
45588 gaaaactgctcattgcttgaagaagctattcggtttcgagagtcaatgtag 45538

57 ENCSLLEEAIRFRESM* 
dplORF094 

8281 atgtacgaattagttctatcactaaaattgacgccgacagcgccgatgtctcaagatgtcgaaaagtgcttcaaaaggctcaag 

1 MYELVLSLKLTPTAP-MSQDVEKCFKRLK 

8365 tatattcagtggcggcaggtgaatgcattaaaattgcacacggatttgctcttgaacttcctaagggatatgaagcaatcttgc 

29 YIQWRQVNALKLHTDLLLNFLR DMKQSC 

8449 atcctcgttccagtctttttaagaaaactggtctaa 8484

57 I LVPVFLRKLV* 
dplORF095 

8877 gtgggaaaactacttcagctctcgacattgtcaagaatgcgcaaatggtatttgagcaggaatgggaacagaagactgaagaac 

1 VGKLLQLSTLS RMRKW YLSRNGNRRLKN 

8961 tcaaggaaaagctggaaaatgcgcgtgcatccaaagctagcaagactgctgtcaaggaacttgaaatgcaactcgatagtcttc 

29 SRKSWKMRVHPKLARLLSRNLKCNSIVF 

9045 aagagcctcttaagattgtatatcttgaccttgagaatacattag 9089

57 KSLLRLYI LTL RIH* 
dplORF096 

46681 gtgattcacaaattcttcaatttcgttgaacttatctgcggtttctcctgttaccaggttgcatttgactgtcttcgaaagtat 

1 VIHKFFNFVELICGFSCYQVAFDCLRKY 

46597 cttagcaagaggttcaataaccttttcccaattgctaaatatcacgcaggactttccttgctggatacattcctcgacaatttc 

29 LSKRFNNLFPIAKYHAGLSLL DTFLDNF 

46513 gatacatctttcgaacttgcaagacttgacatcttgagtagttaa 46469

57 DTSFE LARLDILSS* 
dplORF097 

39100 atggacgggattgaaatcttgatactgaccgacgtatgctcgtccgctgtcagtatgactaaatccctcaccgtttggactatt
1 MDGI EILILTDVCS SAVSMTKS LTVWT I
39016 agagaaagcgaggtgagtatattgcgaacgtccgtcagctcctgcaggtccaggaattccttgaagcccttgaggaccttgaag 
29 R ESEVSI L RTSVSSCRSRNSLKPLRTLK 
38932 accttgaactcctctaggacctgtttcacctatcttggaaactga 38888 

57 TLNSSRTCFTYLGN* 
dplORF098 

43627 gtgaaaatgctccgtgggatgctaaacgaggcgacatcttcatctggggacgcaaaggtgctagcgcaggcgctggaggtcata 

1 VKMLRGMLNEATSSSGDAKVLAQALEVI 

43711 cagggatgttcattgacagtgataacatcattcactgcaactacgcctacgacggaatttccgtcaacgaccacgatgagcgtt
29 QGCSLTVITSFTATTPTTEFPSTTTMS V 
43795 ggtactatgcaggtcaaccttactactacgtctatcgcttga 43836 

57 GTMQVNLTTTSIA* 
dplORF099 

38298 atgcaagttcgccatctgctactgaagctccagctggtggatggtctacgcaagttcctaccgtcccaggtggtcagtatttat

1 MQVRHLLLKLQLVDGLR KFLPSQVVS I Y 

38382 ggactcgaacaagatggcgctacactgaccaaactgatgaaattggatattcagtttcaagaatgggcgagcagggtcctaaag

29 GLEQDGATLTKLMKLDIQFQEWASRVLK 

38466 gtgacgcaggtcgtgacggtattgcaggaaagaacggaatag 38507 

57 VTQVVTVLQERTE * 
dplORF100

1597 atgcagttgacaccaagcgagttctatttggatttagaactacggctgagaatatgtcaagattccttacctggactctcacgg 

1 MQLTPSEFYLDLELRLRICQDSLPGLSR 

1681 agcttatgtggaagcatgctcgtatcgactctatcaaactatgggaaactcctacaggttgcgcagaatgtacttactacgaga

29 SLCGSMLVSTLSNYGKLLQVAQNVLTTR 

1765 ttttcacagaagacgagattgaaatgttcaagaacgtaa 1803

57 FSQKTRLKCSRT* 
dplORF101

19220 gtgataattttagcccagttcccactacatttgaaagcgcgattaggtcatctaggctgtctagctcgagttcgattacaaggt 

1 VI ILVQFPLHLKARLGHLGCLAR.VRLQG 

19304 tgccagtatcaatttcacaaaagtaagcgacatttccaactttctctagtgcttcacgatacctatcatatgtcgcctcttcgt 

29 CQYQFHKSKRHFQLSLVLHDTYHMSPLR 
19388 caaatagtcgcgcagaataaacttcgaatttcattttag 19426

57 QIVAQN KLRISF* 
dplORF102

4034 atgataacgtgggaatgtttgactgtatcgccgaactcgataaaattcctggtgtatttagacagcctaagacacgtgaacagc 

1 M ITWECLTVS PNS t KF LVYLDS LR'HVNS 

4118 ttttggaagcaccacaaatttcttgggataattatctatacatgcgcgagcgaatggttgagaaagacaagctcttacctattt 

29 FWKHHKFLGI I IYTCASEWLRKTSSYLF 

4202 tccatatgggagaagactttaaatggctcaacttga 4237

57 SIWEKTLM GST* 
dplORF103

49352 ttgaatcatagatatagtaacatcacaactatttttctttggcagattgtctttctttgtatttgctgcgcggtgtcctattgt

1 LNHRYSNITTIFLWQ* IVFLCICCAVSYC 

49436 gcaggagtgcataatgagcgagagtctcaagataaggtgattcaaagttataagcagaaagaaaagtcagccgtctacttgaca

29 agVHNER ESQDKVIQSYKQKEKSAVYLT 

49520 gtcgatagttcaggagcttggctaggaagtgctccgggagccaaggaaagtcctctctacaatgaaaagggacagcatgtagga

57 VDSSGAWLGSAPGAKES PLYNEKGQHVG

49604 aaattgaaagaggtgggagagtga 49627 

85 KLKEVGE * 
dplORF104 

21427 atgagaaaaagagtgattttgaagctaaaaaggttgaactggtatgtccttaattcctactctcgaatggttgagtttttcgaa 

1 MR K RVILKLKRLNWYVLNS 'YSRMVEFFE 

21343 cttttgaacttttcgaatggttcgacttttcgaaggattgaggttttcgaaccggttgagtttttcgagcattctcgacttttc 

29 LLNFSNGSTFRRIEVF EPVEFFEHSRLF

21259 gacccctttctatgctcgacttttcgagtgttttga 21224 

57 DPFLC STFRVF * 

dplORF105

2028 atgatagtcgcatccaccagttcgaatgaaaatagtcttttgacctataaccattccttcaccttgaattgcaggaccgaaaat

1 MIVASTSSNENSLLTYNHSFTLNCRTEN

1944 ttccatgataggcattttctcagggtcgcgaacattgattcgaatcttgcctctttcaggctgattgtattgattaaccattat 

29 FHDRH FLRVANIDSNLAS FRLI V .LINHY 

1860 cctgctcctgctctaaaatttcgcggacagtaa 1828 

57 PAPALKFRGQ * 
dplORF106 

10529 atgaacctcgtcaatgatgtaaactttgaactcgctgtccatagacttgtatctagaatcttcaataatgtttcgaacattttc 

1 MNLVM DVNFELAVHRLVSRI FNNVSNIP 

10445 taccccattattagaagcagcatcaatttcaataggagagccaagtcctttgttcacatccttcgcgaaaattcgagcagtagt 

29 YPI IRSSIN FNRRAKSFVHI LRENSSSS 

10361 ggttttaccagttccagcgccaccacagaatag 10329 

57 GFT SSSATTE * 

dplORF107

10750 atgagcgtgacgccctttcgtttattgggaaacttgcaaatggaggaatgcgtgacagtatcacaaggctcgaaaaagtcctcg

1 MSVTPFRLL'GNLQMEECVTVSQGS KKSL 

10834 attatagtcatcacgttgacatggaagccgtttctaatgcactag 10878 

29 I IVITLTWKPFLMH* 
dplORF108 

49447 atgcactcctgcacaataggacaccgcgcagcaaatacaaagaaagacaatctgccaaagaaaaatagttgtgatgttactata 

1 MHS CTIGHRAANTKKDN L P KKNS CDVTI 

49363 tctatgattcaatttcgcttacctccaatcctcttacattgcttgcctgaaaatctagaaccactgaagtatcatatatacgac 

29 SMIQFRLPPILLHCLPENLEPLKYHIYD

49279 tataaagcctttggcctaaaaggtcaataa 49250 

57 YKAFGLKGQ * 
dplORF109 

31632 atgtggttgtcgaagtcccaaatagttgattctccttcaactttccagcctttgaaagccttacctgttaaggtagggtcaact 

1 MWLSKSQIVDSPSTFQPLKAL PVKVGST 

31548 ggttttggagaaatcttcttacctgcttcaactcgaactgcgtcggcggttcctgttccaccgttcaaatcgaatgtcacgcga

29 G FG EIFLPASTRTASAVPVPP FKSNVTR 

31464 cgaagaaccgctggaagttgtgccacatag 31435 

57 RRTAGSCAT * 

dplORF110

16444 atgatttcaattctagcatcaacttccatgtcgcgagtaagtgtgactccagtttcagcgacaggacatgctttgaataccgca 

1 MISILASTSMSRVSVTPVSATGHALMTA

16528 atgtcaagttcgctctttctaataactgagcctaggtctaagtacaagttaggattgattccagtgaccttatattgtttctca 

29 MSSSLFLITEPRSKYK LGLI PVTLYCFS 

16612 gtttcttttacaggaatgctttcatag 16638 

57 VSFTGMLS* 

dplORF111 actcCatcaaqaaagctcttgcaattggtgttcaagg ^

1 VTLSRKLLQLVFKVLGKTSCFLQVTLRM

28741 tcatcgctgaaaaaacaggtcttcaaatcgctgtctactctaagaaaattgctcagtccgctgacgctgacaaacttcctgacg 

29 SSLKKQVFKSLSTLRKLLSSLTLTNFLT 

28825 ctggtaacattcgtcagttcaacttga 28851

57 LVTFVSST* 

dplORF112

32207 atgcaaactgatttaggcaaatactgcttcgacgcagcagccgttgcttatattagatatttgcaggaagacaagactcctagg

1 mqtDLGKYCFDAAAVAYIRYLQEDKTPR 

32291 tatcctggtgacgaaaagaaaaatccaggattgcaaatgcttatggagtga 32341 
29 Y PGDEKKNPGLQMLME*

dplORF113 aaaacagctaaagaagcaatcaaacaattcgg ^ 

1 MKTVKEAIK QFGDEWWYEI INENGQ MIQ 

17631 gacggaagaaccgaagacatgggcgaatacatggaagaaacggtcgaccaagttaagttcatcaactatggtgacatcgaatct
29 DG R IE DMGEYMEETVDQVKFI NYGDIES 

17547 caaattatcaaactatatatcgcataa 17521
57 QIIKLYIA* 

dplORF114

52952 atgctattggcgaagacggggaaacagtccatcctgataattgtccattatgccaaaacggattccctcgtattgaaaaactat
1 M L LAKTG KQS I LI I V'HYAKTDS LVLKNY 

53036 ttcttcaactttacaaccatgatacgggaaaagttgaaacatgggaccgaggccgttcttatgttcaaaagattgttacattta 
29 F F N FTT M I R E KL KHGT EAV LMF KRLLHL 

53120 tcaataaatatggaagccttgtga 53143 

57 SINMEAL* 

dplORF115 ctttttfct gatatata taatatacacgaattatcgcgagtttgtaaagccgtttctaaataattttaaatctttt

1 M s LLFLIYI IYTNYREFVKPFLNKFK SF 

5258 aagcatattgagttttgcttcataagtcccgttcacggcagcctcttgcattttgagtacaatgaaaggaggttcctcgatatt 
29 KHI EFC F ISPVHGSLLHFEYNERRFLDI 

5174 gttgaaactatagaaggtgaataa 5151 

57 VETIEGE* 
dplORF116 

20662 atgaaattttcaaactttgctaaagcacttactaatgaatacctaatggtagtgaacaatgaccaagctgaagtcttaggcgca 
1 MKFSNFAKALTNEY LMVVNNDQAEVLGA

20578 ggaaatatcgaaaacattctcaacggttcgaactttgctaatgttgtagctgaagcgacagttttaaaactcgaaaaactcagc 
29 G N I E N I L NG S N FANVVA EATV L KL E K LS 

20494 gaagaggaagctattgagtag 20474 

57 EEEAIE* 

dplORF117

24680 atgataacaggctgcccgaacattttaaatcgaagtgaatctcgtaagtcactaatagttttgttcaagttatctgctactgtg
1 M I T G CSN I L N RSESRKS L I VLFKLSATV 

24596 ataaggtctttgacatcgcttgtcccgtatatgtcattagtcaatggttcattaagaataactcgacaaggaatttgcttcaag
29 IRS LTSL VPY MSLVNGSLRITRQGICF K 

24512 ccggttggggcggattcttga 24492 

57 PVGADS* 

dplORF118

15023 atgatattatctacgtcgacgcaacttgtgaaactattaaatacgaggagcctattgcatgaacaatcagcgaaagcaaatgaa

1 MI LST STQLVKL LNTRSLLHEQSAKAME 

15107 caaacgaatcgtcgaacttcgcgaagactatcaacgtgcaagaggtcgaataaacttccttcttgctgtaaaggaccacggcga 

29 Q T NR RTSR RL .STCKRSNKLPSCCKGPRR 

15191 agaactcgaaaaccttga 15208 

57 R T R K P * 

dplORF119

41054 atggaggttcaacatccccgattcagtacgtcctactttttcgggcatttctttagtagacacgacttcagcggttcgacagat

1 M E VQ HP RFSTSYFFGHFFSR HDFSGST D

41138 tttaacagggaacaacttcctccaaatcatgtcgaacattcaagtcaacttcaacaatgcttccggcgcttacggatccactat 

29 F NREQLP PN H VEHSSQLQ QCFRRLRIHY 

41222 ccaagcatttcacgctga 41239 

57 P S I S R * 

dplORF120

28387 gtgttgaagcgcaagcagaatacatgcgtatgcaattgcttcaatacggtaaattcactgtcaaatcaactaacagcgaggctc

1 VLKRKQNTCVCNCF N T V NSLSNQLTARL 

28471 aatacacttacgactacaacatggatgctaagcaacaatatgcagtcactaagaaatggactaacccagctgaaagtgacccta 

29 NTLTTTTWMLSNNMQSLRNGLTQLKVTL

28555 tcgctgacattttag 28569 

57 S L T F * 

dplORF121 gacggatcacqtQagttcagtttggaagataataatc 

1 VQTDHVSSVWKIIINNIWVITPIMSKQI 

39306 gcagggatcgaactaagtatcgatggtttgaccgccttgccaatgttcaagtgggaggtcgaaacgagttccttaattctttat 

29 A G I ELSIDGLTALPMFKWEVETSSLILY 

39390 ttgaatttggtttaa 39404 

57 L N L V * 

dplORF122

40402 atgttattctccttatcctacataccgaatcacgttcatgcctggattaaacgagtattgttccgttctaaatcggccgacttg

1 M LFSLSYI PNHVH VWIKRVLFRSKSADL 

40318 aacggattgggtaaagatcccgttatcgatgtgaatgaacccttgcgtaaggtacataacttcattccctgcggagaacataga 

29 NGLGKDPVIDVNEPLRKVHNFIPCGEHR

40234 aattcggtcacttga 40220 

57 N S V T * 

dplORF123

atggttcgacttttcgaaggattgaggctttcgaaccggttg

1 MVRLFEGLRFSNRLSFSS I LD FSTPFYA

21243 cgacttttcgagtgttttgaggttttcgagcaggttcgacttttcgagaaattgagtttttcgacctctaaattaggctcgatt 

29 RLFECF EVFEQVRLFEKLSFSTSKLGSI 

21159 attcgaaaagtttag 21145 

57 I R K V * 
dplORF124

17891 atggtaaaagttaaagatttgcaagtaggaatgaaagttgtaaatgcaaaaggtactgaatttaaagtaactgaccgtcaaggt

1 MVKVKDLQVGMKVVNAKGT E F KVTDR QG 

17807 cgtaaatgggtaagcctagaacgtcttagtgatggacgtattcggttctatgataacgaatcactaatggacgaaaaagtggag

29 RKWVS LERLSDGRI RFYDNES LMDEKVE 

17723 gtagtaaaatga 17712 

57 V V K * 
dplORF125 

49916 atgtcctcagccgcttccgttaaaattggaacaagtgaattatatagatgctcctcttttagcttgtcgataaggtattcatca

1 MS SAASVKIGT SELYRCS S F S. L S I RYSS 

49832 gtttcgccaatttcgaaaaattcgaatccaggaaaatggtcgagaatagtttcgtcgtccggaactcttccatatctcgaaaag

29 VSPISKNSNPGKWSRIVSSSGTLPYLEK 

49748 tgttcttga 49740 

57 C S * 
dplORF126

16136 atgagctcaagtacgttttctcgaacaatagggtcaagtccagttatatcaacgaactgtatatcgtcctcttgtataggaata 

1 MSSSTFSRTIGSSPVISTNCISSSCIGI 

16052 aggtctgcgtacagttgcatggctgaccctttaattggagtaactgttccttcactgtttattttaaataaggttatcatttct 

29 R SAYS CMADPLIGVTVPSLFILNKVII S 

15968 atcctctaa 15960 

57 I L * 
dplORF127 

13511 atgctaaatagccttcccattcaccgtcgctgttcttgcgccatttttcagtttcacgatactgaccaactttgcaaaggtcgt

1 MLNSFPIHRRCSCAI FQFHDTDQLCKGR 

13427 gaaatagtgctacgattgcaactgtttccattgggtaaatgtcttcccagcctttgcctaccatggtatccatttcgaaaagta

29 EIVLRLQLFPLGKCLPS LC LPW YPFRKV 

13343 gttgattga 13335 

57 V D * 
dplORF128 

4852 atgacagcagttcaacaagttaagttctacttagaagaagccggcgctcactttctaaaagatgttgagtacagtgacaactta

1 M TAVQQVKFYLE E AG A H F L KD V E Y S D N L

4936 gagcaagcaattatgaaagatattcttaaatggaatggcgctcatagagatgagcacgatatgaaaataacttcatacgaagta

29 EQAIMKDILKWNGAHRD EHDMKITSYEV 

5020 ttatag 5025 

57 L * 
dplORF129

25133 atgaactttctgctaagcaacttgcgctcactgaagttcaaactaatgtacgcagccaccaatcttacattgaagaattcagta 

1 MN FLLSNLRSLKFKLMYAATNL TLKNS V 

25217 agaaggaaaaggcggacaaggaatgggaacgcattttggaagaacttgctcagcttgacgaaatctcagctggagcattgcctg 

29 RRKRRTRN'GNAFWKNLL SLT KSQLEHCL 

25301 tattag 25306 

57 Y * 
dplORF130 

16789 gtgcttgactttattcctttattatcgtataatcataatataaataaaacaagcgtcaaggacgcagaaagaggtcaattatgg 

1 VLDFI PLLSYNHNIN.KTSVKDAERGQLW 

16705 aaacaacactttatttcggttatcttacagcagattggaaagacggtcacaagaactacactttccactatgaaagcattcctg 

29 KQH F I SVI LQQIG KT VT RTT L S T M KAFL 

16621 taa 16619 
57 *

dplORF131 

43846 atgctcaaccggctgagaagaaacttggctggcagaaagatgctactggtttctggtacgctcgagcaaacggaacttatccaa 

1 MLNRLRRNLAGRKMLLVSGTLEQTELIQ 

43930 aagatgagttcgagtatatcgaagaaaacaagtcttggttctactttgacgaccaaggctacatgctcgctgagaaatggttga 44013

29 KMSSS I SKKTSLGSTLTTKATCSLRNG* 
dplORF132 

15304 gtgactggaaggtcatctaatacacatagcctcaagacatttcgttggctttcaggaaaacattcgactagattgtcaatgtat 

1 VTGRSSNTHSLKTFRWLSGKHSTRLSMY 

15220 cccacaaaggcttcaaggttttcgagttcttcgccgtggtcctttacagcaagaaggaagtttattcgacctcttgcacgttga 15137

29 PTKASRFSSSSPWS FTARRKF I RPLA R* 
dplORF133 

8061 atgacttcttcattcatgacaagttttcgagtttctgcttgcttgtcaggaatagttttcccggcggctaaaatgtatagatta 

1 MTSSFMTSFRVSACLSGIVFPAAKMYRL 

7977 tcgtatttttctttcctgatagcagaacttgaatccatttgtattcccaccatttccgccctatctgcggcgaaataa 7900 

29 SYFSFLIAELESICI PTISALSAAK* 
dplORF134 

498 atgacttcaatgtacttaggttccatcaattcatacaagtcattcaaaataatgttcatgcaatcttcgtggaagtcaccgtgg 

1 MTSMYLGSINSYKSFKIMFMQSSWKSPW 

414 ttacggaaactgaataagtacaatttcaatgatttagattcaaccatcttttcgtttggaatgtaa 349 
29 LRKLNKYNFNDLDSTIFSFGM*
dplORF135 

780 atgaagcagaacttgaaaatgctgctaatgttgcaatgttctacggagtcaagttcaccattcttgaaattgactcgaaaatct 
1 MKQNLKMLLMLQCSTESSS PFLKLTRKS 

864 actcaagctctagctcttccttattacaaggaaaaggcgaaatttcacatggaaaatcttacgctgaaatcctag 938

29 TQALALPYYKEKAKFHMENLTLKS* 
dplORF136

55252 gtgaagaaatcttcaataaccttattcgcttctttgacagatacattcatctgctcagcgattgagttagccccgcggccgcac
1 VKKSSITLFASLTD TFICSAI E L A P R P Y 

55168 ataagacctaaaagaacggacttgacagaatttcttcgaagttttccttccttgttagtcgttccgtcgggatag 55094 
29 I RPKRTDLTE FLRSFPS LLVVP S G * 

dplORF137 

37146 atgcttcgaacttgtttgttagcaccgtcaggaggacaaactagtcgaacccattcacctgcgtctttgataatatctagcgcg 
1 MLRTCLLAPSGGQTSRTHS PAS LI ISSA 

37062 acagcgcctacagaagaagcaacgtgtttcaacttcctaggcaagccttctgctagttcataccataatgcgtag 36988 
29 TAPTEEATCFNFLGKPSAS SYHNA* 

dplORF138

30662 atgactatatcgaagaacaatgtagtcatccggcctatctgtatcttgctcgtcaaattcaactcctggaagcataggagcagg 
1 M T I SKNNVVIRPICILL VKFNS W K H R S R 

30578 cgagagctgaaatgtaggaagaatttccttcaatctgtccatcattgtcgttcgtttagtcatgttcactcctag 30504 
29 RELKCRKNFLQSVH HCRSFSHVHS * 

dplORF139 

12092 atgatactaaatcactcaacttgtttgaccctcctgataaattcgttcacgcagacacgcgcatttgagccctttttagatacc
1 MILNHSTCLTLLINSFTQTRAFEPFLDT 
12008 tttcgcaaacacctagatgcttccctcactaaaaggtcatgggcctcaagttcttcgaaagacatttctacatag 11934 
29 FRKHLDASLTKRSWASSSSKDIST* 
dplORF140

20562 atgttttcgatatttcctgcgcctaagacttcagcttggtcattgttcactaccattaggtattcattagtaagtgctttagca 
1 MFS I FPAPKTSAWSLFTT I RYS LVSALA 

20646 aagtttgaaaatttcattttattttccctttatttgtttttctttatactattattatacaataatgattga 20717 

29 KFEHFILFSL YLFFFILLLYNND* 

dplORF141 

42922 gtgctaagagttgtagagatatcctctaaaacgctcttggctttattcgatttccattcgaataacttatttagtaggacagta 
1 VL RVVEISSKTLLALFDFHSNNLFSRTV

42838 agcactccgctgcacgctgtaataatcgtcgtcaagactgctgtgtcgtttagccacattggcatagattga 42767

29 STPLHAVIIVVKTAVSFS HIGID* 

dplORF142 

31898 gtgactgtcgaagtttctccaaacagttctgtcactttacctaaaagcgtattagggattttcccgttagcgattaggttcatg 
1 VTVEVSPNSSVTLPKSVLGIFPLA IRFM 

31814 acacctgctgctcgaattttaacatggataggttcactaccttttgaaaatcctggaagtgcgatgatttga 31743

29 TPAARILT WIGSLPFENPGSAMI * 

dplORF143 

7565 atgaagtttgggttgacgcttttaactccagaccgtttaattttttcaaggcttgaaattggataccatataatcttttcatgc
1 MKFGLTLLTPDRLIFSRLEIGY H. I I F S C 

7481 ttttggaaatacactaaaattccggcgagaataaatttgcatccatctgcgcgtgatagctggaaccattga 7410

29 FWKYTK I PARINLHPSARDSWNH* 

dplORF144

36517 gtgcaaatcaagcgactaacttatttagatacattaaacgaggcgcattcttcaagattcctaatggaaattcaacaattacca 
1 V Q I K R ,L T Y L D TLNEAHS S R F LM E I QQLP 

36601 ttgaataccgagccgatgacgcagcagcttggacctctactcttcccgctcaagttgaactgtttctaa 36669
29 LNTEPMTQ QLGPLLFPLKLNCF* 

dplORF145

42067 atggaaacagctggagacctaacaagtggaaagaggttctatttaagcaagacttcgaacagaataattggcagaaacttgttc
1 METAGDLTSGKRFYLSKTSNRI IGRNLF 

42151 ttcaaagtgggtggaaccatcactcaacctatggcgacgcattctattcgaaaactcttgacggcatag 42219 
29 FKVGGTITQPMATHSIRKLLTA* 
dplORF146

51484 atgacaaactgcatgattgcatcacctttccagtacggaacctcaagggcgaaacagtattcttcaaccgtcgaagtgttcgtt
1 MTNCMIASPF QYGTSRAKQYSSTVEVFV 

51568 ctaagtttcaccagtacggtgaagatgaccctaaaacggaatttctttatggccaatatgagcttgtag 51636
29 LSFTSTVKMTLKRNFFMANMSL* 
dplORF147 

55207 atgtatctgtcaaagaagcgaataaggttattgaagatttcttcaccgagttccctaaagtggcagactatatcatattcgttc 
1 MYLSKKRIRLLKISSPSSLKWQTISYSF 
55291 aacagcaggcgcaggacttgggatatgttcaaacagctaccggtcgaagaagaaggcttcctgatatga 55359
29 NSRRRTWDMF KQLPVEEEGFLI*

dplORF148

28636 gtgtttcggttcaagaccattcgagtagggcgaacacctgtacgattttcgatgtcatccattgctgctaaaatgtcagcgata 
1 VFRFKTIRVGRTPVRFSMSS IAAKMSAI 

28552 gggtcactttcagctgggttagtccatttcttagtgactgcatattgttgctcagcatccatgttgtag 28484
29 GSLSAGLVHFLVTAYCCLASML* 
dplORF149 

26474 atgccattgaacttttcgagcataaggattaaccttgccccattgtctcactccagctgtggcggaatggctaatggtagttcg 
1 M PLNFSS I RINLAPLSHSSCGGMANGSS 

26390 agcaagtcgaagggcattgtattcgagattttgatatttatgagcagcaggtttccctag 26331 
29 SKS KGIVFE ILIFMSSRFP*
dplORF150

15185 gtggtcctttacagcaagaaggaagtttattcgacctcttgcacgttgatagtcttcgcgaagttcgacgattcgtttgttcat

1 VVLYS KKEV YSTSCTLI VFA KF DDS FVH 

15101 ttgctttcgctgattgttcatgcaataggctcctcgtatttaatagtttcacaagttgcgtcgacgtag 15033 

29 LLSLIVHAIGSSY. LIVSQVAST* 
dplORF151 

28027 atgattatatcaacgcaggggagattgctagctacattcaagcacttccttcaaacgctcttcaataccttggaccaactcttt

1 MIISTQGRLLATFKHFLQTLFNTLDQLF 

28111 tccctaatgctcaacaaacagggacagacatttcatggctcaagggtgcaaataatttgccagtaa 28176
29 SLMLNKQGQTFHGS RVQI ICQ *
dplORF152 

42235 atgtgcataaaggacttatcgacaaagaggctactattgcagtacttcctgaaggatttagaccgaaagtttcaatgtatcttc 

1 MCIKDLSTKRLLLQYFLKD LDRKFQCIF 

42319 aggctctcaataactcatatggaaatgccattctatgtatatacactgacggaagacttgtggtga 42384 

29 RLS ITHMEMPFYVYTLTEDLW * 
dplORF153

22307 atggtggacaaagggctcaccttttcgaactttcgatatcgtcatagcagacggttccattcgttcaggaaaaacagtatcgat 

1 MVDKGLTFSNFRYRHSRRFHS FRKNS ID 

22391 ggctctttcattttccctttgggccatgacggaattcaacggacaaaactttgccatctgtggtaa 22456 

29 GSFIFPLGH DGIQRTKLCHLW* 
dplORF154 

18446 gtgacaataggctttaagaactgcaaaaaaacctggggcgtctgcacgcgcaacctggagctccttaacagtcatccaaggctg 

1 V T I G F KNCK KTWGVCT RN L E L L N SH PRL 

18530 aggtttcttacaaacaatcctaattccttcaaaatagctcttgtccgggtcaatagtgcctaa 18592 

29 RFLTNNPNS FKIALVRVNSA * 
dplORF155

13512 atgaatacgaccctgagcaacttacaatgggacatggtgcaaaatctaatttccttcttcaacgtttcattcaactcacgccag 

1 MNTTLSNLQWDMVQNLI S FFNVS FNSRQ 

13596 ttgaagctcaagcaattttctggcatatgggagcctatgatattagtccttatgcaaatttga 13658

29 LKLKQFSGIWEPMI LVLMQI * 
dplORF156 

18777 atgctagtatctccatttctgttggtcttgctttttagctctgttcagttcagctgcttctcgcgatgcaatagtttcgagaat 

1 MLVS PFLL VLLF.SSVQFSCFSRCNSFEN 

18861 atgcctgttcataggctcacaatattccgccaaagatttgccagttatggtggcgtcaattaa 18923 

29 M PVH RLT I FRQRFAS YGGVN * 
dplORF157 

13281 gtgcttgctggacttgagaagaaattggtatcattttcgagccaatccataaggttctcgataccgtcacgattgattgtttct 

1 VLAGLEKKLVSFSSQSIRFSIPSRLIVS

13197 gttactgctttcttgaagcgttttttaaagtctgtcatattagacccctttcattttctataa 13135 

29 VTAFLKRFLKSVILDPFHFL*
dplORF158 

40727 gtgaacgccgttattagggtcaaacgaagcccaaacggacattgtctttgtcccgtcactattgtgaggaacagtcacttctcc 

1 VN AVIRVK RSP NGHCLCPVTIVRNSHFS

40643 acttgcgagcgttacctcttcgccggacgtgtcgtagtctgggtgactgctatgaacacttga 40581

29 TCERYLFAGRVVVWV TA MNT* 
dplORF159 

30371 atgatttggtctgcgcttacccaagcagcttctcctttgagtttctgtcgagcattccctgtacggtctgtccaaatagcatgc 

1 M IWSALTQAASPLSFCRAF PVRSVQIAC 

30287 gtctttgcgtattcttccatcttagtagcagcgacttcgcagactgttatgacagcgacttga 30225 

29 VFAYSSILVAATSQTVM TAT* 
dplORF160 

41324 atgggttacagacacgcgaggaaaacaatcgaacgtccaagacgtatctatcaatgttatagaatactatggaccgtctatcaa 

1 MGYRHARKT I ERPRRIYQCYRI LWTVYQ 

41408 tttctccgttcaacgtactcgtcaaaatcctgcaattatccaagctcttcgaaatgctaa 41467 

29 FLRSTYSSKSCN YPSSSKC* 
dplORF161

52175 atgcaaaaaggtttaaatgcttatctcgacatgacattgaaagcattgcattcgagactatttcaaaatgtttggcaacgttca

1 MQKGLNAY LDMTLKALHSRL FQNVWQRS 

52259 aatcaaaccaaggggccaagttttcaacttaccttacaagactcttcaagaatagaatag 52318 

29 NQTKGPSFQLTLQDSSRI E * 
dplORF162

13020 atgacagaagttgcggtaaatagcccgcaaaaggtgagagtagttatggtcgggaatattgaatttctcgaatatttaaaaagg 

1 MTEVAVNS PQKVRVVMVGN I EFLEYLKR 

13104 aagtacggaacagaaacttccatcagttatattatagaaaatgaaaggggtctaatatga 13163 

29 KYGTETSISYIIENERGLI *
dplORF163 

40224 gtgaccgaatttctatgttctccgcagggaatgaagttatgtaccttacgcaagggttcattcacatcgataacgggatcctca

1 VTEFLCS PQGMKLCTLRKGS FT S ITGSL 

40308 cccaatccattcaagtcggccgatttagaacggaacaatactcgtttaatccagacatga 40367 

29 PNPFKSADL ERNNTRLIQT * 
dplORF164 

6696 atgtactcttggagaacttcgtgcctaaatgttccagcttcgcccattgcaattaggttagaatctgcgttatctataacagac 

1 MYSWRT SCLN.VPASPIAIRLESA-LSIID 

6612 tcaccgattctttcgaaatacatttttcgaatacatccaccaaccccgctgggcttataa 6553 

29 S PILSKYIFRIHPPTPLGL* 
dplORF165

50504 atgagtgaaagctggtcaatccccaccacagatggtctatatttagatatcatgctatctaaaattgcaggggtaaggttcttt 
1 MSESWS I PTTDGLYLDI MLSKIAGVRFF 

50420 cctccaatcataaagggcgtgactaccacaagggaattttcagcctcagtcattgcttga 50361
29 PPI I KGVTT TREFSASV I A * 

dplORF166 

23519 gtggtcatgctctttaatgactctatcttctcccgtttggctcgctttactgtcccagctgtaagcatagtattcatcaatgtc
1 VVMLFNDSIFSRL ARFTVPAVS I V F I N V 

23435 gtgcgtgttgctagggtcgagtgtaaatctattctcagccaagagttcagcgtgaaatga 23376 
29 VRVARVECKSILSQE FSVK* 

dplORF167 

1008 atgcttattcggttggagcttcttacgtcgtatatggtgctcacgcagacgatgcggctggaggtgcttaccctgattgcactc 
1 ML I RLE LLTSYMVLT QT M RLE VLTL IAL 

10 92 ctgagttctataattcaatgtcaaatgcaatggaatatggaactggaggcaaggtaa 114 8 

29 LSSIIQCQMQWNMELEAR* 
dplORF168 

54345 atgagactttttccaggttatattcttcacattgttcagttcctggagtcaagtattgttcttgaaattcatagagttcgaaag
1 MRLFP GYI LHIVQFL ES S IVLEIHRVRK 

54261 tttgcaaagggtcataggccgcatacatataggcaacatcaggaggaattaaactaa 54205
29 FAKGHRPHTYRQHQE ELN * 

dplORF169 

45954 atgaacacagcatcgcgaagagtttcaatgttagtgataaggaagaattcgtcgtggccaccaagcaagtcttctgcccgttta
1 MNTA SR RVSMLVI RKN S S W P PS KSSA RL 

45870 gaaactccgtcaatcactaatttcccatctttagtgactcgacttcctaaaatatga 45814 
29 ETPSITNFPSLVTRL PK I* 

dplORF170

27600 atgatgattgttcttgtgctcctgccgtttgttgagcagcagcaagttgcttaccaaaagagccgatttcacgaggttcgggaa 
1 MMIVLVLLPFVE QQQVAYQKSRFHEVRE 

27516 caccaccaccgacacgacctggatttcctaaatttccagtcccggctggcgacttag 27460 
29 HHHRHDLDFLNFQSRLAT * 

dplORF171 

47678 atgtcattttctttcatgtactcttttagagcatcacgaagacttttgacttgtttctccatgtcgcctttggtagcatttaat
1 MS F S FMY S FRAS R R L LT C F S MS P L VAFM 

47594 tcaccggcttcttcaattgcagcgatgaactgtttttcatcttcaaatttcatttaa 47538 

29 SPASSIAAMNCFSSSNFI*



dplORF172

10462 atgtttcgaacattttctaccccattattagaagcagcatcaatttcaataggagagccaagtcctttgttcacatccttcgcg

1 MFRTFSTPLLEAAS I S IGEPSPLFTSF A 

10378 aaaattcgagcagtagtggttttaccagttccagcgccaccacagaatagatag 10325

29 KIRAVVVLPVPAPPQNR* 
dplORF173 

32160 atgacattagacatttccttcgtctgtacgaaaggtttcagcttgagtcacttcaccgtacattgcactgaagattgtcataag 

1 MTLDI S FV CTKGFS LS H FTVHCTEDCHK

32076 ttgctcatctgtcatatactcgccgacttcagcgtaagtaggctctaccattga 32023 

29 LLICHILADFS VSRLYH * 
dplORF174 

29765 atgtcccatcagcccttttcattaagattgtcgaaccagcgttcgacttttcatcagtttcaagctgttcttgcttatattggt 

1 MSHQPFSLRL SNQRS T FHQFQA VLAYIG 

29682 cataatagaattgcgccatttgtttccagtagtctgcgtcaccttttagactga 29629 

29 HNRIAPFV SSSLRHLLD * 
dplORF175 

15648 atgcgcgtgatgtcatggcagataggcgaggataaagagtgtcgaatagaacgccgcagagcttacgagagcgccaaatacaag

1 MRVMSWQIGEDKECR I ERRRAYESAKYK 

15564 ggcgacggtactacggtggtcctcttgcttacctgtaaccaaataaaccattga 15511 

29 GDGTTVVLLLT CNQ I NH * 
dplORF176

43031 gtgataaagacggtaacgttgaatttttctagttccgtcttgaatgacgtcattttggtgattgattgctactgtcgtttggtc 

1 VI KTVTLNFSS SVLNDVILVIDCYCRLV 

42947 aatcccgtcgacctgctgtctaagagtgctaagagtcgtagagatatcctctaa 42894

29 MPVDLLFKSAKSCRD I L * 
dplORF177 

19937 atgaacctaaacagttcgagacttctcaagctgttgggaaagaagcaggtcgaatattttggtgggaacgtgaacttggtcata 

1 MNLNSSRLLKLLGKKQVEYFGGNVNLVI 

19853 ttctcgcgactaattttaggtgcttttgtattaatcagcgtgatatgcgcttga 19800 

29 FSRLILGAFVLISVICA* 
dplORF178 

11924 atgacaactgtcgaccaatttaaaagacagttgaggaaaagtttaggctcaatttttccttcatcagtttccttaaatttgagc 

1 MTTVDQFKRQLRKSLGS I FPSSVSLNLS 

11840 caattagtaacctttagcgaattgctagcacttgcctcccatattaagtcataa 11787 

29 QLVTFSELLALASHIKS*
dplORF179 

56058 atgggtagggttattccttacctcgttgatttgctttatgcaaaacctaccacaatcgcttgtcgtggcttcaggagttgcatt 

1 MGRVI PYLVDLLY AK PTT I ACRG FRSCI 

56142 tcggataagtcaaaaagcaagtgtctttatattcgacaagccctcgaataa 56192 
29 LDKSKSKCLYIRQALE *
dplORF180

41176 atgttcgacatgatttggaggaagttgttccctgttaaaatctgtcgaaccgctgaagtcgtgtctactaaagaaatgcccgaa 

1 M FDM I WRKL F PV K I C RTAEVVSTKEMPE

41092 aaagtaggacgtactgaatcggggatgttgaacctccatccgtttgaatag 41042 

29 KVGRTESGMLNLHPFE * 
dplORF181 

13126 atggaagtttctgttccgtacttcctttttaaatattcgagaaattcaatattcccgaccataactactctcaccttttgcggg 

1 MEVSVPYFLFKY SRNS IFPTITTLTFCG 

13042 ctatttaccgcaacttctgtcataggctgtcctcctttgcttatactgtaa 12992 

29 LFTATSVIGCPPLLIL* 
dplORF182

45369 gtgcttgcccatgtttcaataaatagggttcgacctcgcctagctttcgaacgtgctataacgatttcaatcatagcgaagaaa

1 V LAHVSINRVRPRLAFERA ITISIIAKK 

45285 ggtgagaagcttcaatcaattccattgcggtgtcaatatcttcttccttga 45235

29 GEKLQS I PLRCQ YLLP * 
dplORF183 

13896 gtgattccagcttttggtttttcttcagcctcttcaactttttcttccttaggcgcaggtttcttacgagttgaactcttaggt 

1 VI PAFGFSSAS STFSS LGAGFLRVELLG 

13812 ttttcttcaactacttcttcaacctcagcctcttgttcaactggaccttga 13762 

29 FSSTTS.STSASCSTGP* 
dplORF184 

53330 gtgaacttgccgtcaaccacgtcaaacatttggtcttcgtcgaggtctaaaattagagttccaagaagttcgctcttttctgga 

1 VNLPS TTSNIWSSSR SKIRVPRSSLFSG 

53246 aaatcttcaagagtagcactgtcttccggacgctctggaaggaattcataa 53196 

29 KSSRVALSSGRSGRNS* 



dplORF185 

22522 atgaaattcgagatgttcgaaatgaaaatctacttattattagacactttagaaatggcgaagaaattgtcaactacttctata 
1 MKFEMFEMK I YLLLDTLEMAKKLSTTSI 

22606 tatttggaggaaaagatgagtcgagtcaagaccttatacagggggtaa 22653 

29 YLEEKMSRVKTLYRG *

dplORF186 

21272 atgctcgaaaaactcaaccggttcgaaaacctcaatccttcgaaaagtcgaaccattcgaaaagttcaaaagttcgaaaaactc 
1 MLEKLNRFENLNPSKSRTIRKVQKFEKL

21356 aaccattcgagagtaggaattaaggacataccagttcaacctttttag 21403 
29 NHSRVGIKDIPVQPF * 

dplORF187

34415 atggtcttgttcaatctcttcctactatcattcaagcagctgttcaaattatcactgctttattcaatggtcttgttcaggcac 
1 MVLFNLFLLSFKQLFKLSLLYSMVLFRH 
34499 ttcctacgcttattcaagcaggtcttcaaattttgtcagctctcataa 34546 
29 FLRLFKQVFKFCQLS *

dplORF188 

35609 atgttcgtaaagcagccggttcgcctcgagtggacttgttcaatacaggaagtgacaaccctaaccaacctcagtcacaatcta 
1 MFVKQPVRLEWTCSIQEVTTLTNLSHNL

35693 aaaacaatcaaggcgagcaaaccgttgtcaacattggaacaatcgtag 35740 
29 KTIKASKPLSTLEQS * 

dplORF189

42587 atgcaaacgcagtatcaaccgtctctgaaactcttcatgacccagacttgtatgctgcgaaccgtcgagaacttcgagctgacg 
1 MQTQYQPSLKLFMTQTCMLRTVENPELT

42671 agcaaaaacttcgcgaaactcgttacgcaatcgaagatgaaattctag 42718 
29 SKNFAKLVTQSKMKF* 
dplORF190 

39786 atgtattcactcaaagttgttcagtgtggctcaatcatattaaaatcgaacttggtaatatctctactccttttagtgaagcag 
1 MYSLKVVQCGSIILKSNLVISLLLLVKQ

39870 aggaagaccttaaatatcgaattgactcaaaagccgatcaaaagctaa 39917
29 RKTLNIELTQKPIKS* 
dplORF191 

40996 atgtccattgttccggaacttgatttaggtaagtaccttgctaagtccagtgacggcgtaaaggatacgctagtagtatggttc 
1 MSIVPELDLGKYLAKSSDGVKDTLVVWF 
40912 ttacctaaatctatccagtcgctaccgaaaactcggtaccaaacttga 40865 
29 LPKSIQSLPKTRYQT* 
dplORF192 

2920 atggtcgacgtcgaatgttttttcgagatgaagtttagggtcttctcgataccctacggtatgttcagcgagtgctttaacaaa 
1 MVDVECFFEMKFRVFSIPYGMFSECFNK

2836 acggaatggagtatcttgcaacccgtcacgttctgcgtcctcgcctaa 2789 

29 TEWSILQPVTFCVLA*

dplORF193 

42456 atgatttcagctcaaattaaatacgaaatgagacattgtctaaatttaaccaagaattatctacattcgatttcaccacaagtc 
1 MISAQIKYEMRHCLNLTKNYLHSISPQV

42372 ttccgtcagtgtatatacatagaatggcatttccatatgagttattga 42325 
29 FRQCIYIEWHFHMSY * 

dplORF194 

40284 atgaacccttgcgtaaggtacataacttcattccctgcggagaacatagaaattcggtcacttgataccttaatggtagagcta 
1 MNPCVRYITSFPAENIEIRSLDTLMVEL
40200 ccgtcgttcttaccgataattagaccttcattagaagagctcatgtaa 40153

29 PSFLPIIRPSLEELM*
dplORF195

42584 atgttcacaatcgttgttttgacaagtttcttttcagctccttgtccaatagtgaactctgccacaatttggcgcgattttgta 

1 mfTIVVLTSFFSAPCPIVNSATIWRDFV

42500 aggttcaacatagttctcacctcctttctaaaaaatattataacatga 42453 

29 RFNIVLTSFLKNIIT*
dplORF196 

11273 atggtagatttaacaagtccctgtccaatcatgtcactcctccttgctcatcaaaagaagtttggtttcaattatcggtttagc 

1 MVDLTSPCPIMSLLLAHQKKFGFNYRFS

11189 attaggctcccatttaacaactccagcaagttcattcatttcttctag 11142

29 IRLPFNNSSKFIHFF*
dplORF197

7484 atgaaaagattatatggtatccaatttcaagccttgaaaaaactaaacggtctggagttaaaagcgtcaacccaaacttcatcg 

1 MKRLYGIQFQALKKLNGLELKASTQTSS

7568 atgcagggtatgaagtttcttacaagaagcgtcgaactagattga 7612 

29 MQGMKFLTRSVELD*



dplORF198 

24119 atgccgctcaacaaattgacgtccagttttattcaatgcctcagttcacctatacagttgaccctagaaacccttccagcttgc 
1 MPLNKLTSSFIQCLSSPIQLTLETLPAC

24203 tttctgttgacattgtttatcaggacgagcgtacaaaaggaatga 24247 

29 FLLTLFIRTSVQKE*

dplORF199 

15742 gtggctcctgaattaggccgtacttttcctcccaactgcttagcaactgccttctcttgtctagcactagctctgcgcgtggga 
1 VAPELGCTFPPNCLATAFSCLALALRVG

15658 attggtttgtatgcgcgtgatgtcatggcagataggcgaggataa 15614 

29 IGLYARDVMADRRG * 

dplORF200 

47843 atgacaggcttgtattcgataagccctgaaagtttttcacacatttcttccgtctcggcttcgtcaactaatttttcgataatt 
1 MTGLYSISPESFSHISSVSASSTNFSII

47759 tctttcaagcgttcttcgtccatagttgagcgctctgtcgtgtag 47715 
29 SFKRSSSIVERSVV*
dplORF201 

38569 atgggcttcacaagttccttctttaatcaaaggtcaatatctttggactcgaactatttggacctataccgattcaactaccga 
1 MGFTSSFFNQRSISLDSNYLDLYRFNYR

38653 aacgggctatcaaaaaacctacattccaaaagacgggaatga 38694 
29 NGLSKNLHSKRRE* 
dplORF202 

44483 gtggggcgtttattttttataaaaattttttacaaaatgcttgacaacattcactcattatcgtataatacaattataaaaata 
1 VGRLFFIKIFYKMLDNIHSLSYNTIIKI

44567 aataaagccgaaaggcgaggaggacattatgtcaaaaattaa 44608 
29 NKAERRGGHYVKN* 
dplORF203 

22781 gtgattaggattggccgggttacaagagaaccacattttcgaacctgttacggaacagcgccctgtcgcttggttgacaaacga 
1 VIRIGRVTREPHFRTCYGTAPCRLVDKR

22697 ttcaggcatcagtgccacctcatcacagaagatacctgctaa 22656 
29 FRHQCHLITEDTC*

dplORF204 

1471 atgaccacggttcgagtcaagggatggttgttgacttttatcacgtcaagaaaatcgcaggtacattcattgacagacttgacc 
1 MTTVRVKGWLLTFITSRKSQVHSLTDLT

1555 acgctgttcttcttcaagggaatgaaccaatcgctttag 1593 

29 TLFFFKGMNQSL * 

dplORF205 

8524 gtgacactgatgaatggttctcagtttggtatgctactcgtgacgcagatatcttctacgaccaaagaattgcccaatttagaa 
1 VTLMNGSQFGMLLVTQISSTTKELPNLE

8608 ttcaggaaaagcaacctgctatcaagttcaatttcgtag 8646 

29 FRKSNLLSSSIS* 
dplORF206 

19855 atgaccaagttcacgttcccaccaaaatattcgacctgcttctttcccaacagcttgagaagtctcgaactgtttaggtccacc 
1 mtkFTFPPKYSTCFFPNSLRSLELFRFI
19939 aaattgttcaacttgagcaagtgcgatattattctttag 19977 
29 KLFNLSKCDIIL *

dplORF207 

27502 gtgtcggtggtggtgttcccgaacctcgtgaaatcggctcttttggtaagcaacttgctgctgctcaacaaacggcaggagcac 
1 vSVVVFPNLVKSALLVSNLLLLNKRQEH
27586 aagaacaatcatcattccttaaataataggaggaactaa 27624 
29 KMMHHSLMNRRN* 
dplORF208 

47279 atgtttggtatgaagcaaaagacttcgctgaagaaaataacattcacttcccgtttgttcttcctgaacctagaacagaccctg 
1 MFGMKQKTSLKKITFTSRLFFLNLEQTL

47363 accatcgtggttctcgattctgggatgacgaaggcgtga 47401 
29 TIVVLDSGMTKA*

dplORF209 

29784 atgttaagaatcaagttcgtagagccattgaaaccgctcctactaaaatcaaggtacttcgaaactcttgggtcagtgatggat 
1 mlRIKFVEPLKPLLLKSRYFETLGSVMD

29868 atggaggaaagaaaaaggataaagcgaatgaagtcgtag 29906 

29 MEERKRIKRMKS*
dplORF210 

53077 atgtttcaacttttcccgtatcatggttgtaaagttgaagaaatagtttttcaatacgagggaatccgttttggcataatggac

1 MFQLFPYHGCKVEEIVFQYEGIRFGIMD

52993 aattatcaggatggactgtttccccgtcttcgccaatag 52955 

29 NYQDGLFPRLRQ* 



dplORF211 

20959 gtgctcgactcttatgtcgcccctaatttttgtttttacttacggactatgggatttgtaggtattttcagggcgcttttttat 
1 VLDFYVAPNFCFYLRTMGFVGIFRALFY

20875 ttacttattaagtccttttctatattagattgtttataa 20837 

29 LLIKSFSILDCL* 
dplORF212 

52983 atggactgtttccccgtcttcgccaatagcattgcaattgatatagcgtcgacgaccgtcaacgtctgcttcgtggactacgaa 
1 mdCFPVFANSIAIDIASTTVNVCFVDYE

52899 ataatccatgtcttcgccttccgggtcatcatacaatag 52861 

29 IIHVFAFRVIIQ* 
dplORF213 

30291 atgcgtctttgcgtattcttccatcttagtagcagcgacttcgcagactgttatgacagcgacttgaaacttgtttcgataccg 
1 mrlCVFFHLSSSDFADCYDSDLKLVSIP
30207 ttcacagttactaacaaattcttcaggcttccatactaa 30169 

29 FTVTNKFFRLPY* 
dplORF214 

24273 atgatgccaaagttgtttttcagtgctcattccttttgtacgctcgtcctgataaacaatgtcaacagaaagcaagctggaagg 
1 MMPKLFFSAHSFCTLVLINNVNRKQAGR

24189 gtttctagggtcaactgtataggtgaactgaggcattga 24151 

29 VSRVNCIGELRH* 
dplORF215 

35822 atgttaccaaaccctgatagagtttctttacttctattatacaatcctctcgacagtttgtcaacgtcgtcattgtttcgaact 
1 MLPNPDRVSLLLLYNPLDSLSTSSLFRT

35738 acgattgttccaatgttgacaacggtttgctcgccttga 35700 

29 TIVPMLTTVCSP *

dplORF216

32849 atggcctcggagctcgcggccacatctcctccagatacggcagccaggtcaagtacccctggcatagcgtccatgatttcattt
1 MASELAATSPPDTAARSSTPGIASMISF

32765 acctggaaaccggctgaagctagattttccataccttga 32727 

29 TWKPAEARFSIP* 
dplORF217 

23443 atgaatactatgcttacagctgggacagtaaagcgagccaaacgggagaagatagagtcattaaagagcatgaccactgcatgg 
1 MNTMLTAGTVKRAKREKIESLKSMTTAW

23527 ataggaacagatatgcctgtctcactgacgctctaa 23562 
29 IGTDMPVSLTL* 
dplORF218 

22029 atggaatgcttccggaagaggttcgatatagactacaaattgagcgcgagaaaattacattgctccgggccaaaatgggcgacc 
1 MECFRKRFDIDYKLSARKLHCSGPKWAT

22113 aggaaattgaaggcgaggttaaagataacttcgtag 22148 

29 RKLKARLKITS *

dplORF219 

51388 atgattttatgctcgactttttcagttctcccatttcttcgaaacgcttcagggctgacgccttgcctaactacttcgctagat 
1 MILCSTFSVLPFLRNASGLTPCLTTSLD

51304 gttccaaaattccttttcagccactggtttccatag 51269 
29 VPKFLFSHWFP* 
dplORF220 

6334 atgaagttttcttcggtgacggttgatacaatttccttcaagagtaagctgttaaggtggcaagtgaattctttcttcgaaact
1 mkFSSVTVDTISFKSKLLRWQVNSFFET

6250 ttcttgccagcagatgcgtacatgatgtcttcataa 6215 

29 FLPADAYMMSS * 

dplORF221

43507 atgactgctcaagttctatgtactatgctctccgctcagccggagctt

1 MTAQVLCTMLSAQPELQVLDGQSILSTC

43591 acgcatggcttattgaaaacggttatgaactaa 43623 

29 TKGLLKTVMN*

dplORF222 

13212 gtgacggtaccgagaaccttatggattggctcgaaaatgataccaatttcttctcaagtccagcaagcactcgataccatggaa 
1 VTVSRTLWIGSKMIPISSQVQQALDTME

13296 gctatgaaggtggacttgtcgagcactcattaa 13328 

29 AMKVDLSSTH* 

dplORF223

14055 atgtggtggtacctgctggatatgttcgagatgtctactacttctacagtgaagtcgctgacgtctactacaagaaagatgtcg
1 MWWYLLDMFEMSTTSTVKSLTFTTRKMS

14139 acgagcctgacgatgacagcgacattcttgtag 14171 
29 TSLTMTATFL* 
dplORF224

13621 atgccagaaaattgcttgagcttcaactggcgtgagttgaatgaaacgttgaagaaggaaattagattttgcaccatgtcccat 
1 MPENCLSFNWRELNETLKKEIRFCTMSH

13537 tgtaagttgctcagggtcgtattcatatgctaa 13505 

29 CKLLRVVFIC* 
dplORF225 

32991 gtgagcaacgggtgcgacgtatttcatcgcctctgccatgtcgctagtttctgcgttcgtatcagctgctgctcgagcaaatac 

1 VSNGCDVFHRLCHVASFCVRISCCSSKY 
32907 gtcagccacgtgacccgcctggtttgcctctaa 32875 

29 VSHVTRLVCL*

dplORF226 

25191 gtggctgcgtacattagtttgaacttcagtgagcgcaagttgcttagcagaaagttcatcgctaggaattggatagtggtgttc
1 VAAYISLNFSERKLLSRKFIARNWIVVF
25107 gatagtcattgtcgtaagtgtttgataacttga 25075 

29 DSHCRKCLIT* 
dplORF227 

23115 atgactcaattagatggtagcgcttatgacgtttcgagaatccataaaggccgaaggttgttgcattatagataccaaagtcgc 
1 MTQLDGSAYDVSRIHKGRRLLHYRYQSR

23031 ctgctacgaataaacggtcgaattctatattga 22999
29 LLRINGRILY*

dplORF228

10450 atgttcgaaacattattgaagattctagatacaagtctatggacagcgagttcaaagtttacatcattgacgaggttcatatgc
1 MFETLLKILDTSLWTASSKFTSLTRFIC

10534 tttcaaccggagcatttaatgcgctgttga 10563 

29 FQPEHLMRC *

dplORF229

27634 atgtgcgagttaagaaaactgattttaatcaaaccactcgaagcattgtcgcaattcctgaccactacgttgctttggctgctc 
1 MCELRKLILIKPLEALSQFLTTTLLWLL

27718 aaattccagctaccgcagcaactcaagtag 27747 

29 KFQLPQQLK* 
dplORF230

50723 gtgacgaaaaatccggcatacttgaactatctgtcgttaaaaaccgatatggcgaagaccgaaaaatcatcgaatatatgtggg 
1 VTKNPAYLNYLSLKTDMAKTEKSSNICG

50807 acgttgaaactggaacctatactcttatag 50836 

29 TLKLEPILL*

dplORF231 

31071 atgcgcgtgtcattgcgtttcacatcttcagttccctccgaggtcacggcttcgagttctgctgtttctgccgtatctacgaca 
1 MRVSLRFTSSVPSEVTASSSAVSAVSTT

30987 aagttagctccgccgacttttggcaactga 30958 

29 KLAPPTFGN* 
dplORF232 

29385 atgtcaattccattagctcttgctaattcaacgagctcaggaacggttttagccgcatactcttcgcgcatttgttcaacttcg 
1 MSIPLALANSTSSGTVLAAYSSRICSTS

29301 tcaatttcttcaactgattcaattgtttga 29272 

29 SISSTDSIV* 
dplORF233

52892 atgtcttcgccttccgggtcatcatacaatagagtgacaattgcgctgtcaccgtggtcagcgagtgtgaaaaactcgttatta
1 MSSPSGSSYNRVTIALSPWSASVKNSLL 
52808 gaccctgagctaaatgttcctgatttttga 52779 
29 DPELNVPDF* 
dplORF234 

36253 atgcttacgagtacagcgactcaactgttcgaaaggtttataagtttcaacccgctttgggaggcgatagcttacctaacccag
1 MLTSTATQLFERFISFNPLWEAIAYLTQ

36337 gaagacctactcgacaatttagagtag 36363 
29 EDLLDNLE*

dplORF235 

32768 atgaaatcatggacgctatgccaggggtacttgacctggctgccgtatctggaggagatgtggccgcgagctccgaggccatgg
1 MKSWTLCQGYLTWLPYLEEMWPRAPRPW
32852 ctagttcacttcgagcctttggattag 32878 

29 LVHFEPLD* 
dplORF236

37528 atgttcgtcgcttttagatttagcaatatatcgaggcttcatgtggcgtgtagtaaaccacgaaacatcaatgagatattcact 
1 MFVAFRFSNISRLHVACSKPRNINEIFT

37444 tccattgttgatagaagcaaacgttaa 37418 

29 SIVDRSKR* 

dplORF237 

1678 gtgagagtccaggtaaggaatcttgacatattctcagccgtagttctaaatccaaatagaactcgcttggtgtcaactgcattt
1 VRVQVRNLDIFSAVVLNPNRTRLVSTAF

1594 gctaaagcgattggttcattcccttga 1568 

29 AKAIGSFP*

dplORF238 

1301 atgcctttttgcggtcgatacaagttgcgcaagttccacaactttcagcgtcactttcataacatgaacgagtcaagaaataag 
1 MPFCGRYKLRKFHMFQRHFHNMNESRNK
1217 gaacatctaaatcaattccccatttaa 1191

29 EHLNQFPI* 
dplORF239

26521 atggtgaagtatttcccatcgaagaatgtcctttcgaccatcctaatggaatgtgctaccaaactgtatggtacgaaaactcac 
1 MVKYFLSKMVLSTILMECATKLYGTKTH

26605 tcgaagaaatcgctgatgagttga 26628 
29 SKKSLMS* 
dplORF240 

41893 atgtttggaataagcgtgaaacagagtttacatggcgaagtaacaaatacgaggacaaccctacgggaactcgaggtgaatggg 
1 MFGISVKQSLHGEVTNTRTTLRELEVNG

41977 gactatttcaaaatttctggttag 42000 
29 DYFKISG* 
dplORF241 

47020 gtgtctttccttaatatggagatagttttcattctatttaagcaggatatcgaaaaggttaccaattttagatttcataggctt
1 VSFLNMEIVFILFKQDIEKVTNFRFHRL

46936 accatctacgatataatctgctaa 46913 

29 TIYDIIC* 
dplORF242 

41338 gtgtctgtaacccatgctcttacggtagcggagccattaaagttcatcatacccaatttgccgccgtttccgttgatagcttgg 
1 VSVTHALTVAEPLKFIIPNLPPFSLIAW

41254 tttttacctacgagctcagcgtga 41231

29 FLPTSSA*

dplORF243 

51306 atgttccaaaattccttttcagccactggtttccatagaaccctccatcgtttcgacctaatacattcgagacgaattcagtta 
1 MFQNSFSATGFHRTLHRFDLIHSRRIQL

51222 gtcctgaagtgtagccgcaagtga 51199 

29 VLKCSRK * 

dplORF244 

27083 gtgaggtacaaaatgttgaccgtcgccgtcaatgaaaattttagcatcgagttctttcgaagttttcgaaataatttccttcac 
1 VRYKMLTVAVNENFSIEFFRSFRNNFLH

26999 ctgtttgatagttggctcatctag 26976 
29 LFDSWFI* 
dplORF245 

6278 gtggcaagtgaattctttcttcgaaactttcttgccagcagatgcgtacatgatgtcttcataactgctagtagaagttttaat

1 VASEFFLRNFLASRCVHDVFITASRSFN
6194 tcgaagtcggtctttcaagaataa 6171 

29 SKSVFQE* 
dplORF246 

2831 atggagtatcctgcaacccgtcacgttctgcgtcctcgcctaatagaccaaaaagtctttgaacggctgcctcagtattgtcca
1 MEYLATRHVLRPRLIDQKVFERLPQYCP
2747 aggttacaattccatccggcttaa 2724 

29 RLQFHPA* 
dplORF247 

29641 gtgacgcagactactggaaacaaatggcgcaattctattatgaccaatataagcaagaacagcttgaaactgatgaaaagtcga 
1 VTQTTGNKWRNSIMTNISKNSLKLMKSR 
29725 acgctggttcgacaatcttaa 29745 
29 TLVRQS*

dplORF248 

53560 gtgcaaagcctcgttctagcaagaagaacgatgctcagttacttgctcaacggaaaaacaggaagcctgcagttgaggttactt 
1 VQSLVLARRTMLSYLLNGKTGSLQLRLL

53644 acatttcaggaaacgctctaa 53664 

29 TFQETL* 
dplORF249

2012 gtggatgcgactatcattgcaactggtgtgactcagcctttacctggaacggtactactgagccggaatatatcacaggcaaag 
1 VDATIIATGVTQPLPGTVLLSRNISQAK

2096 aagctgctagtcgaatcttga 2116 

29 KLLVES* 



dplORF250 

23837 atgggcaaacatggaagattgacgaagactcagtcgactataaacctactcgagaaatccgaaactatattcgacaacttatca
1 MGKHGRLTKTQSTINLLEKFETIFDNLS 
23921 aaaagcaatcacgctttatga 23941

29 K S N H A L *

dplORF251 

39205 atggaaataattagtcttaccgtctgcgcctggcttcccgggtatcccttgagctccgtcattccccttccatttcgtccatgt 
1 MEIISLTVCAWLPGYPLSSVIPLPFRPC 
39121 ataggctgcagggtcttttga 39101

29 IGCRVF* 
dplORF252 

54771 gtgttgtataggtcgaaactaattttgcatattttctatatctcaaaagtgcttttgagatatcgttatcaaaatgctcgacaa
1 VLYRSKLILHIFYISKVLLRYRYQNARQ 
54687 tactttcgcctgttcctctag 54667 

29 YFRLFL* 
dplORF253 

56255 atggttgcgtctacaatagaaccgatgttgctagacaaagcatttgcaatcttcgagtctaatttattcgagagcttgtcgaat 
1 MVASIIEPMLLDKAFAIFESNLFESLSN

56171 ataaagacacttgctttttga 56151 

29 IKTLAF* 
dplORF254 

48479 atgaacctttcgcttaggttcaatctttttcgaacattttcatatttaacaaaactttcagctaaaaatcgacaaagttcaatg

1 MNLSLRFNLFRTFSYLTKLSAKNRQSSM

48395 ttcgactcaatgtttaaataa 48375 

29 FDSMFK*
dplORF255 

9572 atgctttggtcttctcgacgaatgactctactacattccctgcagggtttcgagcagtacgggtcaatgatgcaccgttttcgt 

1 MLWSSRRMTLLHSLQGFEQYGSMMHRFR

9488 caaggtagtcaccttttctaa 9468 

29 QGSHLF*
dplORF256 

15289 atgaccttccagtcactaatgcggccgctgaaattggataccactatacatgggttcaccaacttcgagacaaagcagttgaaa 

1 MTFQSLMRPLKLDTTIHGFTNFETKQLK 

15373 cacttgaagaaattttag 15390 

29 H L K K F * 
dplORF257 

28216 gtgaacgtgctggatttagcaaacaagctactgagatggcattcttccgtgagtctatgcgacttggtgaaaaagaccgtcaaa 

1 VNVLDLANKLLRWHSSVSLCDLVKKTVK

28300 acttgcaaatgctattga 28317 

29 T C K C Y * 
dplORF258

44023 atggaaattggtattggttcgaccgtgacggatacatggctacgtcatggaaacggattggcgagtcatggtactacttcaatc 

1 MEIGIGSTVTDTWLRHGNGLASHGTTSI

44107 gcgatggttcaatggtaa 44124 

29 A M V Q W * 
dplORF259 

4298 atgacccgactacgaagcataaagacaagtggatggaaagagtattcgaagttattcgaaacagttctaatccagacgttaaga

1 MTRLRSIKTSGWKEYSKLFETVLIQTLR 

4382 ctcacgcatttgggatga 4399 

29 L T H L G *
dplORF260 

24746 gtgaccctacttcctcaatcggcggtactggaggcaagcaagctcaagtcacttccatttcaggaaacttcaacttccttccag 

1 VTLLPQSAVLEASKLKSLPFQETSTSFQ

24830 cggctgaatattatttag 24847 

29 R L N I I * 
dplORF261 

288 atgaattcacttccctttgccctaaaacaggacagcctgacttcgcgaatgttttcattagttacattccaaacgaaaagatgg 

1 MNSLPFALKQDSLTSRMFSLVTFQTKRW

372 ttgaatctaaatcattga 389 

29 L N L N H * 
dplORF262 

9408 atgcctattcaactccaggcggaaagatgtggaagcatgcttgtgcagttcgacttaaatttagaaaaggtgactaccttgacg

1 MPIQLQAERCGSMLVQFDLNLEKVTTLT

9492 aaaacggtgcatcattga 9509

29 K T V H H * 

dplORF263 

27052 atgaaaattttagcatcgagttctttcgaagttttcgaaataatttccttcacctgtttgatagttggttcatctagacctttt 

1 MKILASSSFEVFEIISFTCLIVGSSRPF

26968 aacaagtcttctaattga 26951 

29 N K S S M * 
dplORF264 

6139 gtgaatagtacaaggcggtctaatacgcccaggatttccgctgtagggatagccgcatcatcttcaaactcaattgagtcaagc

1 VNSTRRSNTLRISAVGIAASSSNSIESS

6055 tgtgaaacgtcttcataa 6038

29 C E T S S * 
dplORF265 

4801 gtgaataaagtcaagcgtttttgtataaaaagttcatttttttttaaaaaaaataagagcgaaaagctcttatctaaaatagtc

1 VNKVKRFCIKSSFFFKKNKSEKLLSKIV 

4717 gacgttgacgatttttaa 4700 

29 D V D D F * 
dplORF266 

50220 atgcccgttcttccaagcagttgcaagcattttatcaatagtccacgacttaccttgtccaggtcgagccattatgacaatcaa 

1 MPVLPSSCKHFINSPRLTLSRSSHYDNQ

50136 atcctcaccaggaagtaa 50119 

29 I L T R K * 
dplORF267 

47367 atggtcaaggtctgttctaggttcaggaagaacaaacgggaagtgaatgttattttcttcagcgaagtcttttgcttcatacca

1 MVKVCSRFRKNKREVNVIFFSEVFCFIP

47283 aacattaatcgtagatag 47266 

29 N I N R R * 
dplORF268 
12621 atgtcaatttcggtcttgtgcttgacaatggattcaactactgatgcgtcaacctttttcaatcgcgacagctcgtccaattca

1 MSISVLCLTMDSTTDASTFFNRDSLSNS

12537 ttgtcaattctagagtaa 12520 

29 L S I L E * 

dplORF269

cg agtccat cagtttctacgtcaatagaacctattccgtcttcaatcattttgtctacatactgctcgagtct

1 VNSIESISFYVNRTYSVFNHFVYILLEF

53750 tgcttcctcagtgattaa 53733 

29 C F L S D * 
dplORF270 

50792 atgatttttcggtcttcgccatatcggtttttaacgacagatagttcaagtatgccggatttttcgtcacgcttcatagcgata 

1 mifRSSPYRFLTTDSSSMPDFSSRFIAI

50708 actctgctagcattttga 50691 

29 T L L A F * 
dplORF271 

19739 atgaggctgctttgctttatcttcgttaccgtattgaccgacttcctactcgcgaaccttcctacaagaattcatacctcaaag 

1 MRLLCFIFVTVLTDFLLANLPTRIHTSK

19655 gctttttgtcagccttag 19638 

29 A F C Q P * 
dplORF272 

1556 gtggtcaagtctgtcaatgaatgtacctgcgattttcttgacgtgataaaagtcaacaaccatcccttgactcgaaccgtggtc 

1 VVKSVNECTCDFLDVIKVNNHPLTRTVV

1472 ataagttccgcctgctaa 1455

29 I S S A C *
dplORF273 

56256 atggatttcattaggactgagtcctcttggaattggaacggttgcatatatagatattccgtcagccgtactaggccaagttct 

1 mdfirtESSWNWNGCIYRYSVSRTRPSS

56340 agttcagtttatcttgcagtcaattgcttcgagatatttgaaaaagtagtcaggaaaattcctgattatcttgcagtcaattgc 

29 SSVYLAVNCFEIFEKVVRKIPDYLAVNC

56424 ttcgagatatttgaaaaagtagtcaggaaaattcctgattattttttttacaaaaacgcttga 56486 

57 FEIFEKVVRKIPDYFFYKNA* 
Table 31



Query= sid| 114822 | lan| dplORF001 Phage dpi ORF| 36698-40390 | 2
(1230 letters)

>gi| 928828 (L44593) ORF1904; putative [Lactococcus lactis phage BK5-T]
Length = 1904

Score = 427 bits (1086), Expect = e-118 

Identities = 226/475 (47%), Positives = 281/475 (58%), Gaps = 45/475 (9%)

Query: 395 AESGKYIGVLNTNKKPSELVPDDFTWIRLEGPKGDAGLPGAPGRDGVDGVPGKSGVGIAD 454 

A+ YIG + P DVTW + +G+ G GA G+DGV GK GVGI 
Sbjct: 820 ADYPSYIGQYTDFIQYDSAKPSDYTWSLI RGNDGKDGATGKDGV AGKDGVGIKT 873 

Query: 455 TAITYAVSVSGTQEPENGWSEQVPELIKGRFLWTKTFWRYTDGSHETGYSVAYIGQDGNS 514 

T ITYA+S SGT +P GW+ QVP L+KG++LWTKT W YTD S ETGYSV YI +DGN+ 
Sbjct: 874 TV I T YALS S SGTD K PNTGWT S Q VPT LVKGQ YLWT KTVWT YTD S S S ETG Y S VT Y I AKDGNN 933 

Query: 515 GKDG I AGKDG VG I AAT E VMY AS S P SAT EA P AGG WS TQ V PTV PGGQ YLWT RTR WR YTDQTD 574 

G DG I AGKDGVG I T ♦ YA ST APA GW++QVP VP GQ+LWT+T W YTD T 
Sbjct: 934 G^^DGIAGKDGVGIKKTTITYAVGTSGTTAPASGWNSQVPNVPAGQFLWTKTVV^^YTD^^TS 993 

Query: 575 EIGYSVSRMGEQGPKGDAGR DGIAGKNGIGLKSTSVSYGISPTDSAIP-GVWASQVP 630 

E GYSV+ MG +G KGD G +GIAGK+G G+K+T+++Y SP + P G W++ VP 
Sbjct: 994 ETGYSVAMMGVKGDKGDPGNNGTNGIAGKDGKGIKATAITYQAS PNGTTAPTGTWSASVP 1053 

Query: 631 S L I KGQ YLWT RT I WT YTD S TT ETG YQ KT Y I P KDGNDG KNG I AG KDG VG I KSTT I T YAG ST 690 

+ KG *■ LWT RT I WT YTD +TT ETG Y Y+ +GN+G +G GKDG G I K+TT I T YAGST 
Sbjct: 1054 P VAXGS FLWTRT IWTYTDhTTT ETG Y A VAYMGTNGNNG KDG F PGKDGTG I KTTTI TYAGST 1113 

Query: 691 SGTVAPTSNWTSAIP^QPGFFLWTKTVWNYTDDTSETGYSVSKIGETXXXXXXXXXXXX 750 

SGT P + WTS +P V G +LWTKTVW YTD+TSETGYSV+ +G 
Sbjct: 1114 SGTT P PNNGWTSTV PTVAEGNYLWT KTVWT YTDNTS ETG YSVAMMG:- VKGDKGDP 1167 

Query: 751 XXXXXXXXXXADGRS -QYTHLAFSNS PNGEGFSHTDSGRAYVGQYQDFNPVHSKDPAAYT 809 

DG+ + T + + SPNG A G + P +K +T 

Sbjct: 1168 GNNGTNG I AG KDG KG I KAT A I TYQ AS PNGT - -TAPTGTWSASVPPVAKGSFLWT 1219 

Query: 810 WTKW - KGNDG AQG I PGK PGADGKTNY FH I AY AS S ADG S 846 

T W GN+G G PGK G KT I YA S G+ 

Sbjct: 1220 RT I WTYTDNTT ETG YAVAYMGTNGNNGHDGF PGKDGTG I KTT - - TITYAGSTSGT 1272 



Score = 396 bits (1007), Expect = e-109

Identities = 208/449 (46%), Positives = 260/449 (57%), Gaps = 42/449 (9%)

Query: 421 I RLEG PKGD AG L PG A PGRDG VDG V PGKSG VG I ADT A I T Y A VS V SGTQE P ENGWS EQ V PEL 480 

+ + G KGD G PG +G +G+ GK G GI TAITY S +GT P WS VP ♦ 
Sbjct: 1155 VAMMGVKGDKG- - -DPGNNGTNGIAGKDGKGIKATAITYQASPNGTTAPTGTWSASVPPV 1211 

Query: 481 IKGRFLWTKTFWRYTDGSHETGYSVAYIGQDGNSGKDGIAGKDGVGIAATEVMYASSPSA 540 

KG FLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T + YA S S 
Sbjct: 1212 AKGS FLWTRT IWTYTDNTTETGYAVAYMGTNGNNGHDG F PG KDGTG I KTTT I TYAGSTSG 1271 

Query: 541 TEAPAGGWSTQVPTVPGGQYLWTRTRWRYTDQTDEIGYSVSRMGEQGPKGDAGR---DGI 597 

TP GW++ VPTV G YLWTVT W YTD T E GYSV+ MG +G KGD G +GI 
Sbjct: 1272 TT P PNNGWTSTVPTVA£GNYLWTKTVWTYTDNTS ETG YSVAMMGVKGDKGD PGNNGTNG I 1331 

Query: 598 AGKNG IGLKSTS VS YG I S PTDS AI P - GVWASQ VPS LI KGQYLWTRTI WTYTDSTTETGYQ 656 

AGK+G G+K+T+++Y SP + P G W+ + VP + KG +LWTRTIWTYTD+TTETGY 
Sbjct: 1332 AG KDG KG I KAT A I TYQ AS P NGTT APTGTW S AS V P P V AKGS F LWT RT I WTYTDNTT ETG YA 1391 

Query: 657 KTYIPKDGNDGKNGIAGKDGVGIKSTTITYAGSTSGTVAPTSNWTSAIPNVQPGFFLWTK 716 

y+ +GN+G +G GKDG G I K+TT I TYAGST SGT P + WTS +P V G +LWTK 
Sbjct: 1392 VAYMGTNGNNGHDGF PGKDGTG I KTTT ITYAGSTSGTTPPNNGWTSTVPTVAEGNYLWTK 1451 



Query: 717 TVWNYTDDTSETGYS VS KIGETXXXXXXXXXXXXXXXXXXXXXXADGRS - QYTHLAFSNS 775 

TVW YTD+TSETGYSV+ +G DG+ ♦ T ♦ + S 

Sbjct: 1452 TVWTYTDNT S ETGYS VAMMG VKGDKGD PGNNGTNG I AG KDGKG I KATAITYQAS 1505 
Query: 776 PNGEGFSHTDSGRAYVGQYQDFNPVHSKDPAAYTWTKW KGND 817

PNG A G + P +K +T T W GN+
Sbjct: 1506 PNGT TAPTGTWS AS VP PVAKGS FLWTRT I WTYTDNTTETGYAVAYMGTNGNN 1557

Query: 818 GAQGIPGKPGADGKTNYFHIAYASSADGS 846

G G PGK G KT I YA S G+
Sbjct: 1558

Score = 31

Identities = 179/322 (55%), Positives = 222/322 (68%), Gaps = 7/322 (2%)

Query: 421 IRLEGPKGDAGLPGAPGRDGVDGVPGKSGVGIADTAITYAVSVSGTQEPENGWSEQVPEL 480

+ + G KGD G PG +G +G+ GK G GI TAITY S +GT P WS VP +
Sbjct: 1311 VAMMGVKGDKG D PGNNGTNG I AGKDGKG I KATAITYQAS PNGTTAPTGTWSAS VPP V 1367

Query: 481 IKGRFLWTKTFWRYTDGSHETGYSVAYIGQDGNSGKDGIAGKDGVGIAATEVMYASSPSA 540

KG FLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T + YA S S
Sbjct: 1368

Query: 541 TEAPAGGWSTQVPTVPGGQYLWTRTRWRYTDQTDEIGYSVSRMGEQGPKGDAGR DGI 597

T P GW+ + VPTV G YLWT+T W YTD T E GYSV+ MG +G KGD G +GI
Sbjct: 1428

Query: 598 AGKNGIGLKSTSVSYGISPTDSAIP-GVWASQVPSLIKGQYLWTRTIWTYTDSTTETGYQ 656

AGK+G G+K+T+++Y SP + P G W++ VP + KG +LWTRTIWTYTD+TTETGY
Sbjct: 1488

Query: 657 KTYIPKDGNDGKNGIAGKDGVGIKSTTITYAGSTSGTVAPTSNWTSAIPNVQPGFFLWTK 716

Y+ +GN+G +G GKDG GIK+TTITYAGSTSGT P + WTS + P V G +LWTK
Sbjct: 1548

Query: 717 TVWNYTDDTSETGYSVSKIGET 738

TVW YTD++ ETGYSV K+G T
Sbjct: 1608


Score = 201 bits (507), Expect = 2e-50

Identities = 121/297 (40%), Positives = 156/297 (51%), Gaps = 19/297 (6%)

Query: 421 I RLEG P KGD AG L PGA PGRDG VDG VPGKSGVG I ADT A I TYA VS VS GTQ E P ENG WS E Q VP E L 480 

+ + G KGD G PG +G +G+ GK G GI TAITY S +GT P WS VP t 
Sbjct: 1467 VAMMGVKGDKG D PGNNGTNG I AGKDGKG I KATA ITYQAS PNGTTAPTGTWSAS VP P V 1523 

Query: 481 IKGRFLWTKTFWRYTDGSHETGYSVAYIGQDGNSGKDGIAGKDGVGIAATEVMYASSPSA 540 

KG FLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T + YA S S 
Sbjct: 1524 AXGS FLWTRT I WTYTDNTT ETGYAVAYMGTNGNNGHDGF PGKDGTG I KTTT I TYAGSTSG 1583 

Query: 541 TEAPAGGWSTQVPTVPGGQYLWTRTRWRYTDQTDEIGYSVSRMGEQGPKGDAGRDGIAGK 600 

T P GW++ VPTV G YLWT+T W YTD + E GYSV +MG GP AG *G GK 
Sbjct: 1584 TTPPNNGWTSTVPTVAEGNYLWTKTVWAYTDNSFETGYSVGKMGNTGP- - - AGS NGN PGK 1640 

Query: 601 NGIGLKSTSVSYGISPTDSAIPGVWASQVPSLIKG-QYLWTRTIWTYTDSTTE--TGYQK 657 

+ T+ G++ S++ ++G+YW W + G 

Sbjct: 1641 WSDTEPTTKFKGLTWKYSGVVTJMPLGNGTKILAGTEYYWNGNNWALYEINAHNINGDNL 1700 

Query: 658 TYIPKDGNDGK-NGIAGKDGVGIKSTTITYAGS TSGTVAPTSNWTS AI PNVQ 708 

+ DGK I G +GV +TTGS +S+ TNTAINQ 

Sbjct: 1701 SVTNGTFKDGKIESIWGSNGV NGTTTIEGSHLQIHSSDSTTNTEN-TLAIDNRQ 1753 



Query= sid| 114823 | lan| dplORF002 Phage dpi ORF| 32386-35835 | 1
(1149 letters)

>dbj | BAA31888 | (AB009866) orf 15 [bacteriophage phi PVL]
Length = 694

Score = 280 bits (709), Expect = 3e-74

Identities = 157/465 (33%), Positives = 257/465 (54%), Gaps = 28/465 (6%)

Query: 40 Q I G S ALTG LG KG LTT A VT L P LMG F AAAS I KVGNE FQAQMS R VQ AI AG AT A£ E LG RMKTQA 99 

+IG+++ +G+ +T VT P++ A + K G EF M +V+A +GAT EE +K +A 
Sbjct: 151 EIGNSMKNVGRNMTMYVTAPWAGFAVAAKKGIEFDDSMRKVKATSGATGEEFEALKKKA 210 
Query: 100 I D LG AKT A FS AK E AAQG MEN LAS AG FQ VNE I MD AM PG VLD LXXXXXXXXXXXXXXMASS L 159

+ +GA T FSA ++A+ + +A AG* + GV+DL ♦ L 

Sbjct: 211 REMGATTKFSASDSAEAIiNYMAIJ^GWDSKQMMEGLSGVMDLAAASGEELGAVSD^ 270 

Query: 160 RAFGLEANQAGHVADVFARAAADTNAETSDMAEAMKYVAPVAHSMGLSLEETAASIGIMA 219 

AFGL+A +GH+ADV A+ ++ N ♦ + EA KYVAPVA ++G ++EVT+ +IG+M+ 
Sbjct: 271 TAFGLKAKDSGHlADVI^QTSSKANTDVRGLGEAFKYVAPVAGALGyTIEDTSIAIGLMS 330 

Query: 220 DAGIKGSQAGTTLRGALSRIAKPTKAMVKSMQELGVSFYDANGNMIPLREQIAQLKTATA 279 

+AGIKG +AGT LR + ++ PT+AM M+ LG+S D+NG MIP+R+ + QL+ 
Sbjct: 331 NAGIKGEKAGTALRTMFTNLSSPTRAMGNEMERLGISITDSNGKMIPMRKLLDQLREKFK 390 

Query: 280 G LTQE E RN RH L VT L YGQNS LSGMLAL LD AG P E KLD KMTN AL VN S DG AAKEMAETMQDNLA 339 

L+++++ T++G + ++SG LA+++A E K+T ++ +S GA+K MA+TM+ L 

Sbjct: 391 HLSKDQQASSAATIFGKEAMSGALAIINASDEDYQKLTKSIDSSTGASKRMADTMESGLG 450 

Query: 340 SKIEQMGGAFESVAIIVQQILEPALAKIVGAITKVLEAFVNMSPIGQKMVVIFAGMVAAL 399 

K+ + E +A+ + +EPAL IV A +KV+ + Q W F VA L 

Sbjct: 451 GKLRTLRSQLEEIALTIYDRIEPALKIIVSAFSKVVTVfVTKLPTSIQLAVVGFGLFVAVL 510 

Query: 400 GPLLLIAGM -VMTTIVKLRIAIQFLGPAFMGTMGTIAGVIAIF 441 

GPL+ + G+ MT + L I + F IA ++ +F 

Sbjct: 511 GPLVFMFGLFISVMG^IA^^^VIX3PLLINVNKASGLFAFLRTKIASLVKLFPILGVSISSLT 570 

Query: 442 - - - - - - - YALVAV FMIAYTKSERFRNFINSLAPAIKAGFGGA 476 

ALV + F AY +SE FRN +N + F A 

Sbjct: 571 L P I TL I VG AL VG I G I A F YQ A Y KRS ET FRN I VNQ A I SG VANAF KAA 615 

Query= sid| 114824 | lan| dplORF003 Phage dpi ORF| 53538-55877 | 3 
(779 letters) 

>sp|P43741|DPO1_HAEIN DNA POLYMERASE I (POL I) >gi | 1074025 | pir| | E64098 DNA polymerase I
(polA) homolog - Haemophilus influenzae (strain Rd KW20)
>gi| 1573871 (U32767) DNA polymerase I (polA)
[Haemophilus influenzae Rd]
Length = 930

Score = 191 bits (481), Expect = 1e-47

Identities = 148/553 (26%), Positives = 262/553 (46%), Gaps = 60/553 (10%)

Query: 63 RLELITEEAKLEQYVDKMIEDGIGSIDVETDGLDTIHDELAGVCLYSPSQKGIYAPVNHV 122 

+ E + +A L ++++K+ + ++D ETD LD + L G + + + Y P+ 

Sbjct: 333 K YE TLLTQAD LT RW I E KLN AAKL I AVDTETD S LD YMS ANL VG I S FALENG EAA Y LP LQLD 392 

Query: 123 SNMTKMRIKNQISPEFMKKMI^RIVT)SGIPVIYHNSKFDMKSI^ 182 

++ + + k +L+ + I I N KFD +SI+ R G+++ +DT L 
Sbjct: 393 YLDAPKTLEKSTAliAAIKPILE NPNIHKIGQNI KFD - ES I FARHGI ELQGVEFDTML 448 

Query: 183 AAMLLNENESHSLKSLHSKYV1WEENAEVA1CFNDLFKGIPFSLIPPDVAYMYAAYDPLQT 242 

+ LN H++ L +Y> +E A + - + F+ IP ♦ A YAA D T 

Sbjct: 449 LSYTLNSTGRHNMDDLAKRYLGHETIAFESLAGKGKSQLTFNQIPLEQATEYAAEDADVT 508 

Query: 243 FELYEFQEQYLTPGTEQCEEYNLEKVSWVUiNIEMPLim,FDMEVYGVDLDQDKLAEIR 302 

+L ♦ L E Y +E+PL+ VL ME GV +D D L 

Sbjct: 509 MKLQQALWLKLQEEPTLVELYK TMELPLLHVLSRMERTGVLIDSDALFMQS 559 

Query: 303 EQFTANMNEAEQEFQQLVSEWQPEIEEIJIQTNFQSYQKLEMDARGRVTVSISSPTQLAIL 362 

+ ♦ + E++ L ♦ +++S QL + 

Sbjct: 560 NE I AS RLTALEKQAYALAGQ PFNLASTKQLQEI 592 

Query: 363 FYDIMGLKSPERDKPRG- - -TGESIVEH-- FDNDISXXXXXXXXXXXXVSTYTT-LDQHL 416 

+D + L ++ P+G T E ++E + STYT L Q + 

Sbjct: 593 LFDKLELPVLQKT - PKGAPSTNEEVLEELSYSHELPKI LVKHRGLSKLKSTYTDKLPQMV 651 

Query: 417 AKPDNRIHTTFKQYGAKTGRMSSENPNLQNIPSRGE-GAWRQIFAASEGHYIIGSDYSQ 475 

R+HT++ Q TGR+SS +PNLQNIP REG +RQ F A EG+ I* +DYSQ 
Sbjct: 652 N S QTG R VHT S YHQ A VT ATG RLS S S D PNLQN IPIRNEEGRHI RQAF I AREG YS I VAAD YS Q 711 

Query: 476 QEPRSLAELSGDESMRHAYEQNLDLYSVIGSKLYGVPYEECLEFYPDGTTNKEGKLRRNS 535

E R +A LSGD+ + +A+ Q D++ ++++GV +E T+++ R + 
Sbjct: 712 IELRIMAHLSGDQGLINAFSQGKDIHRSTAAEIFGVSLDE VTSEQ RRN 759 

Query: 536 VKSVTjLGLMYGRGANSIAEQMNVSVXEANKVIEDFFTEFPKVADYIIFVQQQAQDLGYVQ 595 
K+ + GL+YG A +♦ Q+ +S ♦A K ++ +F + P V ++ ++ + +A+ GYV+ 
Sbjct: 760 AKAINFGLIYGMSAFGLSRQLGISRADAQKYMDLYFQRYPSVQQFMTDIREKAKAQGYVE 819

Query: 596 TATGRRRRLPDMS 608 

T GRR LPD + + 
Sbjct: 820 TLFGRRLYLPDIN 832 

Score = 46.9 bits (109), Expect = 5e-04 

Identities = 34/123 (27%), Positives = 66/123 (53%), Gaps = 16/123 (13%)

Query: 663 EIKDQAKAEGI ■- LIKDNGGKIADAQRQCLNSVIQGTAADMTKYAMIKV 709 

+I+++AKA+G + N ♦ A+R +N+ +QGTAAD+ K AMIK+ 

Sbjct: 807 D I REKAKAQGYVETLFGRRL YLPD INS SNAMRRiCGAERVAINAPMQGTAADI I KRAMI KL 866 

Query: 710 HND AE LKE LG FHLM I P VHD ELLG E V P I KNAKRG AERLT E VM I EAAKD IIS LPMKCD P S I V 769 

++ + +++ VHDEL+ EV + E++ *- M EAA + + + +P+ + ♦ 

Sbjct: 867 -DEVIRHDPDIEMIMQVHDELVFEVRSEKVAFFREQIKQHM-EAAAELV-VPLIVEVGVG 923 

Query: 770 ERW 772 
+ W 

Sbjct: 924 QNW 926 

Query= sid| 114825 | lan| dplORF004 Phage dpi ORF| 40401-42440 | 3
(679 letters)

>emb| CAB07981 | (Z93946) hypothetical protein [bacteriophage Dp-1]
Length = 532

Score = 1011 bits (2585), Expect = 0.0

Identities = 497/499 (99%), Positives = 498/499 (99%)

Query: 1 OTKFINSYGPLHLNLYV^QVSQDVTNNSSRVSWRATTO 60 

MTKF INS YG PLHLNL YVEQVSQD VTNNS S RVS WRATVDRDGAYRTWT YGNI SNLSVWLNG 
Sbjct: 1 MTKF INS YG PLHLNLYVEQVSQDVTNNSSRVSWRATVDRDGAYRTWTYGNI SNLSVWLNG 60 

Query: 61 SSVKSSHPDYDTSGEEVTLASGEVTVPHNSDGTKTMSVWASFDPNNGVHGNITISTNYTL 120 

SSVllSSHPDYDTSGEEVTIASGEVTVPHNSDGTKTMSWASFDPNNGVHGNITISTN^ 
Sbjct: 61 SSV^SSHPDYDTSGEEVTIoASGEVTVPHNSIXSTKTMSVWASFDPNNGVHGNITISTNY^ 120 

Query: 121 D S I P RST Q I S S F EGNRNLG S LHTV I FNRKVN S FTHQ VWYR VFG S D W I D LG KNHTT S VS FT 180 

DSIPRSTQISSFEGNRNLGSLHTVIFNRKWSFTHQVWYRVFGSDWIDLGKNHTTSVSFT 
Sbjct: 121 DSIPRSTQISSFEGNRNLGSLHTVIFNRKVNSFTHQVWYRVFGSDWIDLGKNHTTSVS^ 180 

Query: 181 PSLDLARYLPKSSSGTMDICIRTYNGTTQIGSDVYSNGWRFNIPDSVRPTFSGISLVDTT 240 

P S LDLARYLPKSS SGTMD I CI RT YNGTTQ IGSD VYSNGWRFNI PDS VRPTFSGI SLVDTT 
Sbjct: 181 PSU3LARYLPKSSSGTMDICIRTYNGTTQIGSDVYSNGWRFNIPDSVRPTFSGISLVDTT 240 

Query: 241 S AVRQ I LTGNN F LQ I MS N I Q VN FNNAS GA YG S T I QAFHAE LVG KNQ A I NENGGKLGMMN F 300 

SAVTIQILTGNNFLQIMSNIQVIJFNNASGAYGSTIQAFHAELVGKNQAINENGGKLGMMNF 
Sbjct: 241 SAVRQILTGNNFLQIMSNIQWFNNASGAYGSTIQAFHAELVGKNQAINENGGKLGMMNF 300 

Query: 301 NGSATVRAWVTDTRGKQSNVQDVS INVI EYYG PS INFS VQRTRQNPAI IQALRNAKVAP I 360 

NGS ATVRAWVTDTRGKQSNVQD VS INVI EYYG PS INFS VQRTRQN PAI IQALRNAKVAP I 
Sbjct: 301 NGSATVRAWVTDTRGKQSNVQDVS INVI EYYG PS INFSVQRTRQNPAI IQALRNAKVAP I 360 

Query: 361 WGGQQKNIMQITFSVAPLNTTNFTEDRGSASGTFTTISLMTNSSANLAGNYGPDKSYIV 420 

TVGGQQKNIMQITFSVAPLNTTNFTEDRGSASGTFTTISL+TNSSANLAGNYGPDKSYIV 
Sbjct: 361 TVGGQQKNIMQITFSVAPLNTTNFTEDRGSASGTFTTISLLTNSSANLAGNYGPDKSYIV 420 

Query: 421 KAK I Q DR FT ST E FS ATVAT ES WLNYD KDG RLGVG KWEQG KAGS I D AAGD I YAGGRQ VQ 480 

KAKIQDRFTSTEFSATV TESWLNYDKDGRLGVGKWEQGKAGSIDAAGDIYAGGRQVQ 
Sbjct: 421 KAXIQDRFTSTEFSATVPTESVVLNYDKDGRIX3VGKVVEQGKAGSIDAAGDIYAGGRQVQ 480 

Query: 481 QFQLTDNNGALNRGQYNDV 499

QFQLTDNNGALNRGQYNDV 
Sbjct: 481 QFQLTDNNGALNRGQYNDV 499 



Query= sid| 114827 | lan| dplORF006 Phage dpi ORF| 45296-46987 | 2
(563 letters)

>gb|AAD18987| (AE001666) SWI/SNF family helicase_2 [Chlamydia pneumoniae]
Length = 1166

Score = 171 bits (429), Expect = 1e-41

Identities = 150/522 (28%), Positives = 254/522 (47%), Gaps = 55/522 (10%)
Query: 46 SSNNFE-LPYKYFNNVIDALDEWELHIFGELDKDVQDYIDSRNRIASSSNEQFSFKTTPF 104 

S + FE LP ♦ ♦+ + L E + I GE++ D QD ♦ T 

Sbjct: 659 SLDQFEALPVNF--SMSERLIEIQKQIRGEIEFDFQD VPQQ IQATLRSYQTEG 709 

Query: 105 AHQVECFEYAQEHPCFLLGDEQGLGKTKQAIDIAVSRKASFKH- -CLIVCCISGLKWNWA 162 

H +E + H +L D+ GLGKT QAI IAV+ + K C ++ C + L +NW 

Sbjct: 710 VHWLE- -RLRKMHLNGILADDMGLGKTLQAI - IAVTQSKLEKGSGCSLIVCPTSLVYNWK 766 

Query: 163 KEVGIHSNESAHILGSRVTKDGKLVIDGV-SKRAEDLLGGHDEFFLITNIETLRDAVFIK 221 

+E + E LVIDGV S+R + L D IT+ L+ V 
Sbjct: 767 EEFRKFNPEFR TLVIDGVPSQRRKQLTALADRDVAITSYNLLQKDV 812 

Query: 222 YLNELTKSGEIGWIIDEIHKCKNPSSKQGASIQKLQSYYKMGLTGTPLMNNPIDVFTTVM 281 

EL KS V++DE H KN +++ S++ +QS + ♦+ LTGTP+ N+ +++++ 

Sbjct: 813 ELYKSFRFDYWLDEAHHIKNRTTRNAKSVKMIQSDHRLILTGTPIENSLEELWSLF 869 

Query: 282 KW LG AE HHTLTQ F KERYC I VDQFNQI TGYR NLAELRELVNDYMLRRTKEEVL-DL 335 

+L . L +R+ V ++ + Y N+ L+ + V+ ++LRR KE+VX DL 

Sbjct: 870 DFLMPG LLSSYDRF- -VGKYIRTGNYMGNKADNMVALKKKVSPFILRRMKEDVLKDL 924 

Query: 336 PEKIRVTEYVDMNSKQSKIY KE VLT KL VQ E I D KVKLM PN P LAET I RLRQ ATGN 388 

p + + + Q ++Y K+ L++LV++ ++ + LA RL+Q + 

Sbjct: 925 PPVSEILYHCHLTESQKELYQSYAASAKQELSRLVKQEGFERIHIHVLATLTRLKQICCH 984 

Query: 389 PSILTTQDVK SCKFERCIEIVEECIQQGKSCVIFSbWEKVIEPLAKIL-SKTVKCNL 444 

p+l + S K++ ++++ ♦ G V+FS + K++ + K L S+ + 

Sbjct: 985 PAIFAKDAPEPGDSAJCYDMLMDLLSSLVDSGHKTVVFSQYTKML^ 1044 

Query: 445 VTGETADKFNEIEEFN^RKASVILGTIGAl^TGFTLTKADTVIFLDSPWTRAEKDQAED 504 

♦ G T + + + ♦ +F V L ++ A GTG L ADTVI D W A ++QA D 

Sbjct: 1045 LDGSTKNRI^LVNQFNEDPSLLVFLISLKAGGTGLNLVGADTVIHYDMWWNPAVENQATD 1104 

Query: 505 RCHRIGAKSS VT I YTLVAKGTVDERI EDLI ERKGELADYIVD 546 

R HRIG SV+ Y LV T++E+I L RK L +++ 
Sbjct: 1105 RVHRIGQSRSVSSYKLVTLNTIEEKILTLQNRKKSLVKKVIN 1146 
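
The statistics line that heads each alignment reports raw counts over the aligned length, with a whole-number percentage in parentheses. The relationship can be checked with a minimal Python sketch. The helper name `parse_alignment_stats` is hypothetical and is not part of the BLAST output; the exact ratio it computes may differ by a point from the whole-number percentage printed in the listing.

```python
import re

# Matches e.g. "Identities = 150/522"; \W+ tolerates stray separator characters.
_STAT = re.compile(r"(Identities|Positives|Gaps)\W+(\d+)/(\d+)")

def parse_alignment_stats(line):
    """Return {field: (count, aligned_length, percent)} for one statistics line."""
    stats = {}
    for field, count, length in _STAT.findall(line):
        count, length = int(count), int(length)
        # Exact ratio as a float; the listing itself prints whole numbers.
        stats[field] = (count, length, 100.0 * count / length)
    return stats

if __name__ == "__main__":
    line = "Identities = 150/522 (28%), Positives = 254/522 (47%), Gaps = 55/522 (10%)"
    for field, (count, length, pct) in parse_alignment_stats(line).items():
        print(f"{field}: {count}/{length} = {pct:.1f}%")
```

Fields that are absent from a line (for example, no `Gaps` entry when the alignment is ungapped) are simply missing from the returned dictionary.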

Query= sid|114828|lan|dp1ORF007 Phage dp1 ORF|22230-23621|3 
(463 letters) 

>gi|2444105 (U88974) ORF26 [Streptococcus thermophilus temperate bacteriophage 
O1205] 

Length = 411 
Score = 88.9 bits (217), Expect = 7e-17 

Identities = 80/315 (25%), Positives = 133/315 (41%), Gaps = 48/315 (15%) 

Query: 139 QGVTI^GIFCDEVAI^PESFVNQATGRCSVTGSKMWFSCNPANPNHYFKKNWIDKQVEKR 198 

+G T G + +E +L E ♦ RCS G+++ + NP NPNH+ + ++I K + + 
Sbjct: 121 RG FT AFG A YVNEAS LAN E L VFKE 1 1 S RC SGDGAR WWDS N PDN PNHWLNRD Y I G KN - DGK 179 

Query: 199 ILYLHFTMDDNPSLT DSIKRRYEKK^AGVFRKRFILGLWVTADGLVYSMFNEEQHV 254 

1+ F +DDN L+ DSIK K G F R ILGLW A+G +Y+ ♦ + + HV 
Sbjct: 180 I IDFS FKLDDNTFLSKRYIOS IKAATPK GKFYDRDI LGLWTVAEGAI YADYDSKIHV 236 

Query: 255 KKLNIEFDRLFVAGDFGIYNATTFGLYGFSKRHKRYHLIESYYHSGREAEEQLTEADVNS 314 

E R F D+G + ♦ + G ++L+ + +E + + +A 
Sbjct: 237 VDELPEMKRYFGGIDWGYTHYGSIVIVG-EGVDNNFYLVDGVAAQFKEIDWWVEQA 291 

Query: 315 NIQFSSVLQKTTKEYANDLVDMIRGKQIEYIILDPSASAMIVELQKHPYIAR IQJIPI 371 

+K T Y N + ♦ ++AR + I 

Sbjct: 292 RKLTGIYGN IPFYADSARPEHVARFENEGFDI 323 

Query: 372 IPARNDVTLGISFHAELLAENRFTLDPSNT-HDIDEYYAYSWDSKASQTGEDRVIKEHDH 430 

♦ A V GI A+L E + + DE Y Y W ++ +D +KE D 
Sbjct: 324 MNANKSVIAGIELIAKLFKEKKLYVKRGFVPRFFDEIYQYRWKENST KDEPLKEFDD 380 

Query: 431 CMDRNRYACLTDALI 445 

+D RYA +D +1 
Sbjct: 381 VLDSVRYAIYSDYVI 395 



Query= sid|114829|lan|dp1ORF008 Phage dp1 ORF|49624-50961|1 
(445 letters) 

>gb|AAD19901| (AF100420) DnaB replication fork helicase [Thermus aquaticus] 
Length = 444 
Score = 67.5 bits (162), Expect = 2e-10 

Identities = 69/248 (27%), Positives = 111/248 (43%), Gaps = 14/248 (5%) 

Query: 147 GERLGISTGFEXXXXXXXXXXXXXXXIVIMARPGQGKS-WTIDKMLATAWKNGHDVLLYS 205 

GE G+ TGF+ I I ARP GK+ + + A K G V +YS 

Sbjct: 178 GEVAGVRTGFKELDQLIGTLGPGSLNI - IAARPAMGKTAFALTIAQNAALKEGVGVGIYS 236 

Query: 206 GEMSEMQVGARIDTILS^SINSITKGIWNDHQFEKYEDHIQA^EAENSLVWTPFMIG 265 

EM Q+ R+ + + +N + G D F ♦ D ++EA + TP + 
Sbjct: 237 LEMPAAQLTLRMMCSEARIDMNRVTUiGQLTDRJDFSRLVDVASRLSEAP - IYIDDTPDLTL 295 

Query: 266 GKNLTPAILDSMISKYRPSWGIDQLSLMS- -ESYPSREQKRIQYANITMDLYKISAKYG 323 

+ A ++S+ ♦ ++ ID L LMS S S E. ++ + A 1+ L ++ + G 
Sbjct: 296 ME--V11ARARRLVSQNQVGLIIIDYLQLMSGPGSGKSGENRQQEIAAISRGLKALARELG 353 

Query: 324 IPIVLNVQAGRSAKTEGAESMELEHIAESDGVGQNASRVIAMKRJD EKSGILEL 376 

IPI + Q R+ + + L + ES + Q+A V+ + RD EK+GI E+ 

Sbjct: 354 IPIIALSQLSRAVEARPNKRPMLSDLRESGSIEQDADLVWFIYRDEYYNPHSEKAGIAEI 413 

Query: 377 SWKNRYG 384 

V K R G 
Sbjct: 414 IVGKQRNG 421 



Query= sid|114831|lan|dp1ORF010 Phage dp1 ORF|8699-9859|2 
(386 letters) 

>gi|2760912 (AF037258) RecA protein [Chlorobium tepidum] 
Length = 346 

Score = 133 bits (331), Expect = 2e-30 

Identities = 99/340 (29%), Positives = 164/340 (48%), Gaps = 66/340 (19%) 

Query: 44 GGLPRKRWEFFGPESSGKTTSAIJDIvlCNAQMVFXXXXXXXXXXXXXXXXNARASKASKT 103 

GGLPR RV E +GPESSGKTT AL ♦ AQ 
Sbjct: 67 GGLPRGRVTEIYGPESSGKTTLALHAIAEAQ KNG 100 

Query: 104 AVKELEMQLDSLQEPLKIVYLDLENTl^TEWAKKIGVT)VI)NIWIvllPEMNSAEEILQYVL 163 

•f L +D E+ D +A+K+GVD++ + + +PE S E+ L V 

Sbjct: 101 GIAAL - - VDAEHAFDPTYARKLGVDINALLVSQPE- -SGEQALSIVE 143 

Query: 164 DIFETGEVGLVVLDSLPYMVSQNLIDEELTKKAYAGISAPLTEFSRKVTPLLTRYNAIFL 223 

♦ +G V ++V+DS+ +V Q ++ E+ + +++ RK+T +++ +++ L 

Sbjct: 144 TLVTlSGAVT)IIVIDSVAALVPQAELEGEMGDSVVGLQARLMSQALRiCLTGAISKSSSVCL 203 

Query: 224 G I NQ I REDMN S Q YNA - YS T PGGKMW KHACAVRLKFRKGD YLD ENG AS LTRTARN P AGNW 282 

INQ+R+ + Y ♦ +T GGK K +VRL RK + ++G L GN 
Sbjct: 204 FINQIJU3KIGWYGSPETTTGGKALKFYSSVRLDIRKIAQI - KDGEELV- - GNRT 255 

Query: 283 ESFVEKTKAFKPDRKLVSYTLSYHDGIQIENDLVDVAVEFGVIQKAGAWFSIVDLETGEI 342 

+ V K K P K + + Y +GI + +L+D+AVEFG+I+K+GAWFS + G 
Sbjct: 256 KVKWKNKV-APPFKTAEFDILYGEGISVl/SELIDLAV^FGIIKKSGAWFSYGTEKLG- - 312 

Query: 343 MTDEDEEPLKFQGKANLVRRFKEDDYLFDMVMTAVHEIIT 382 

QG+ N+ ♦ KED+ L + + V +++T 
Sbjct: 313 QGRENVKKLLKED ETLRNT I RQQVRDMLT 341 

Query= sid|114832|lan|dp1ORF011 Phage dp1 ORF|28017-29096|3 
(359 letters) 

>gi|2444110 (U88974) ORF31 [Streptococcus thermophilus temperate bacteriophage 
O1205] 
Length = 348 

Score = 187 bits (469), Expect = 1e-46 

Identities = 118/358 (32%), Positives = 187/358 (51%), Gaps = 21/358 (5%) 

Query: 3 IYDYINAGEIASYIQALPSNALQYLGPTLFPNAQQTGTDISWLKGANNLPVTIQPSNYDA 62 

I YD + A IA Y AL N LG ++FP +Q GT +S++KGA+ V ++ + +D 
Sbjct: 4 I YDKVTASNIAGYFNALQENVSSTLGES I FPARKQLGTKLSYIKGASGQSVALKAAAFDT 63 

Query: 63 KASLRERAGFSKQATEMAFFRES^!RLGEKDRQNLQMLLNQSSA-IJAQPLITQLYNDTKNL 121 
           ♦+R+R +M FF+E+M + E DRQ L ♦ +A L ♦+ ++ND 
Sbjct: 64 

Query: 122 
           V+G A+ E MRMQ+L GK S Y D K+Q V+K W P + P+A 
Sbjct: 124 

Query: 182 
           D+ A+ + G+ P R V+N T+ + K+ S K + ♦ GS > + + E 
Sbjct: 181 

Query: 241 
           +IA+ G+ I ♦ + D G + +F DG + L+P +G+T +GTT 
Sbjct: 236 tfYIADNFGVS I VLENGTYRN DKGEVSKF- - Y P DGHLTL I P NG P LGNTVFGTT 285 

Query: 301 
           PE DL+ T +A+V+++ G VTT PVN+ T VS V +PSFE +D V +LT 
Sbjct: 286 



Query= sid|114834|lan|dp1ORF013 Phage dp1 ORF|10215-11240|3 
(341 letters) 

>sp|P09122|DP3X_BACSU DNA POLYMERASE III SUBUNITS GAMMA AND TAU 
Length = 563 

Score = 182 bits (458), Expect = 2e-45 

Identities = 118/353 (33%), Positives = 176/353 (49%), Gaps = 31/353 (8%) 

Query: 7 YRPQTFEE WAQEYVKE I LLNQLQNGAI KHG YLFCXXXXXXXXXXXRI FAKDVN 60 

+RPQ FE+W QE++ + L N L H YLF +IFAK VN 

Sbjct: 10 FRPQRFEDWGQEHITKTLQNALLQKKFSHAYLFSGPRGTGKTSAAXIFAKAVNCEHAPV 69 

Query: 61 KGL GSPIEIDAASNNGVENVRNIIEDSRYKSMDSEFKVYIIDEVH 105 

KG+ IEIDAASNNGV+ +R*I + +♦ +KVYIIDEVH 

Sbjct: 70 DEPCNECAACKGITNGSISDVIEIDAASNNGVDEIRDIRDKVKFAPSAVTYKVYIIDEVH 129 

Query: 106 MLSTGAFNALLKTLEEPSSGTVFILCTTDPQKIPDTILSRVQRFDFTRIDNDDIVNQLQF 165 

MLS GAFNALLKTLEEP +FIL TT+P KIP TI+SR QRFDF RI + IV ++ 
Sbjct: 130 MLSIGAFNALLKTLEEPPEHCIFILATTEPHKIPLTIISRCQRFDFKRITSQAIVGRMNK 189 

Query: 166 I IESENEEGAGYSYERDAI^SFIGKLANGGMRDSITRLEKVXDYSHHVDMEAVSNAL- - -G 222 

I+++E E +L I A+GGMRJD+++ L++ + +S D+ V +AL G 

Sbjct: 190 IVDAEQ LQ VEEGS LEI I AS AAHGGMRD ALS LLDQAI S FSG - - D I LKVED ALL I TG 242 

Query: 223 VPDYETFASLV^LAIANYDGSKCLEIVTJDFHYSGKDLKLVTR^ 282 

L +++ + + S LE +N+ GKD + + + ++ Y ♦ 
Sbjct: 243 AVSQLYIGKIAKSLHDKNVSDALETLNELLQQGKDPAKLIEDMIFYFRDMLLYKTAPGLE 302 

Query: 283 I TQ L P AH FE S KLEQ FC EAFQ YPTLLWMLEEMN E LAG WKWE PNAK P 1 1 ETKLL 335 

+ + E L M++ +N+ +KW ♦ + E ++ 

Sbjct: 303 GVXEKvTCVTJETFRELSEQIPAQALYEMIDILNKSHQEMKWTNHPRIFFEVAVV -355 

Query= sid|114835|lan|dp1ORF014 Phage dp1 ORF|50961-51974|3 
(337 letters) 

>sp|P47492|PRIM_MYCGE DNA PRIMASE >gi|1361496|pir||F64227 DNA primase (dnaE) homolog 
MG250 - Mycoplasma genitalium (SGC3) >gi|3844848 
(U39704) DNA primase (dnaE) [Mycoplasma genitalium] 
Length = 607 

Score = 57.0 bits (135), Expect = 2e-07 

Identities = 53/190 (27%), Positives = 89/190 (45%), Gaps = 17/190 (8%) 

Query: 146 EELDKYRFIHP YMYERKLTDELIEMFDVGYDK- -LHDCITFPVRNLKGETVFF 196 

E +++Y FI + P Y++ K ++FD K > I P+ + G V F 

Sbjct: 170 ESMERYPFINPKIKPSELYLFS-KTNQQGLGFFDFNTKKATFQNQIMIPIHDFNGNPVGF 228 

Query: 197 NRRSVRSKFHQYGEDDPKTEFLYGQYELVAFRDYFEKPISQVFVTESVINCLTLWSMKIP 256 

+ RSV + ++ EF ♦ ♦ EL+ K ++Q+F+ E ♦ TL + K 

Sbjct: 229 SARSVTININKLKYKNSADHEF-FKKGELLFNFHRLNKNLNQLFIVEGYFDVTTLTNSKFE 287 

Query: 257 AVALMGVGGGN-QINLLKR- -LPYRNIVXALDPDNAGQTAQEKLYRQLKRSK-VVRFLNY 312 

AVAUHG+ + QI +K + +VLALD D +GQ A L +L + +V + + 

Sbjct: 288 AVAIJ4GLAIJNDVQIKAIKAHFKELQTLvTJU^^ 347 
Query: 313 PKEFYDNKWD 322 
D WD 

Sbjct: 348 EHNYKD--WD 355 



Query= sid|114837|lan|dp1ORF016 Phage dp1 ORF|43413-44303|3 
(296 letters) 

>emb|CAB07986| (Z93946) N-acetylmuramoyl-L-alanine amidase [bacteriophage Dp-1] 
Length = 296 

Score = 661 bits (1686), Expect = 0.0 

Identities = 296/296 (100%), Positives = 296/296 (100%) 

Query: 1 MGVT)IEKGVAWMQARKGRVSYSMDFRDGPDSYDCSSSMYYALRSAGASSAGWAVOTEY^ 60 

MG VD I EKG VAWMQARKG R VS YS MD F RJDG P D S YDC S S SMYYALRS AG AS S AGWA VNT E YMH 
Sbjct: 1 MGVD I EKGVAWMQARKGRVS YSMD FRDG PDS YDCS S SMYYALRS AGAS SAGWAVNT E YMH 60 

Query: 61 AWL I ENG YE LIS ENAPWD AKRGD I F I WGRKGAS AGAGGHTGMF I DS ONI IHCNYAYDG I S 120 

AWL I ENG YE LIS EN A P WD AKRGD I F I WG RKG AS AG AGGHTGMF I D S D N 1 1 HCNY A YDG I S 
Sbjct: 61 AWL I ENG YELI S ENAPWD AKRGD I F I WGRKGAS AGAGGHTGMF I DS ON 1 1 HCNYAYDG I S 120 

Query: 121 VND HDERWYYAGQ P YYYVYRLTNANAQ P A£ KKLG WQ KD ATG FWYARANGTY PKDEFEYIE 180 

VTTOHDERWYYAGQPYYYVYRLTNANAQPAEKJCLGWQKDATGFWYARANGTYPKDEFEY 
Sbjct: 121 VNDHD ERWYYAGQ P YYYVYRLTNANAQ P A£ KKLG WQ KD ATG FWYARANGTY P KD E F E Y I E 180 

Query: 181 ENKSWFYFDDQGYMIAEKWLKHTDGNWYWFDRiXJYMA^ 240 

ENKSWFYFDDQGYMrJ^KWLKHTDG^mWFDRDGYMATSWKRIGESWYYFNRDGS^WTGW 
Sbjct: 181 ENKSWFYFDDQGYMLAEKWLKHTDGNWYWFDRDGYMATSWKRIGESWYYFNRDG 240 

Query: 241 IKYYDNWYYCDATNGDMKSNAFIRYNDGWYLLLPDGRLADKPQFTVEPDGLITAKV 296 

IKYYDNWYYCDATNGDMKSNAFIRYNDGWYLLLPDGRLADKPQFTVEPDGLITAKV 
Sbjct: 241 IKTYYDNWYYCDATNGDMKSNAFIRYNDGWYLLLPDGRIJVDKPQFTV^PDGLITAKV 296 

Query= sid|114841|lan|dp1ORF020 Phage dp1 ORF|1864-2658|1 
(264 letters) 

>emb|CAB13247| (Z99111) similar to coenzyme PQQ synthesis [Bacillus subtilis] 
Length = 243 

Score = 217 bits (548), Expect = 5e-56 

Identities = 117/248 (47%), Positives = 163/248 (65%), Gaps = 15/248 (6%) 

Query: 23 MPIMEIFGPTIQGEGMVIGQKTIFIRTGGCDYHCNWCDSAFTWNGTTEPE--YITGKEAA 80 
          +P++EIFGPTIQGEGMVTGQKT+F+RT GCDY OWCDSAFTW+G+ + + + +T +E 
Sbjct: 5 

Query: 81 
          D G +HVT++GGNPAL+ + + I +LKE+ «• LETQGT +Q+WF 
Sbjct: 65 - KDIGGDAFSHVT I SGGNP ALLKQ - LDAF I ELLKENN I RAALETQGTVYQDWF 118 

Query: 141 
          + D+TISPKPPSS MTN + L+I+ + ND S K+VIF++ DL +A+ + K 
Sbjct: 119 TLIDDLTISPKPPSSKMVTNFQKLDHILTSLQENDRQHAVSLKWIFNDEDLEFAKTVHK 

Query: 199 TFEGKLRPVNYLSVGNANAY--EEGKISDRLLEKIX5WLWDKVYEDPAFNNVTIPLPQLHTL 
          + G YL VGN ♦ + ++ + LL K L DKV D N VR LPQLHTL 
Sbjct: 179 RYPG- - -IPFYLQVGNDDVHTTDDQSLIAHLLGKYEALVDKVAVDAELNLVRVLPQLHTL 

Query: 257 VYDNKRGV 264 
          + + NKRGV 
Sbjct: 236 
Query= sid|114842|lan|dp1ORF021 Phage dp1 ORF|2504-3295|2 
(263 letters) 

>sp|P19465|GCH1_BACSU GTP CYCLOHYDROLASE I (GTP-CH-I) >gi|98411|pir||A38256 GTP 
cyclohydrolase I (EC 3.5.4.16) - Bacillus subtilis 
>gi|143231 (M37320) regulatory protein [Bacillus 
subtilis] >gi|143799 (M80245) MtrA [Bacillus subtilis] 
>gi|2634696|emb|CAB14194| (Z99115) GTP cyclohydrolase I 
[Bacillus subtilis] 
Length = 190 

Score = 208 bits (523), Expect = 4e-53 

Identities = 103/185 (55%), Positives = 133/185 (71%), Gaps = 1/185 (0%) 

Query: 80 VTLDNT EAA VQRL FG LLG E D AERDG LQDT P F R FVKAIAE HTVG YR ED P KLH LEKT FD VD H 139 

V + E AV+ + + +GED R+GL DTP R K AE G EDPK H + F +H 
Sbjct: 4 VNKEQIEQAVRQILEAIGEDPNREGLLDTPKRVAKMYAEVFSGLNEDPKEHFQTIFGENH 63 

Query: 140 EDLVLVKDIPFNSLCEHHLAPFVGKVHIAYIPKD-KITGLSKFGRWEGYAKRLQVQERL 198 

E+LVLVKDI F+S+CEHHL PF GK H+AYIP+ K+TGLSK R VE AKR Q+QER+ 
Sbjct: 64 E E LVLVKD I AFH S MC E HHLV P FYG KAHV AY I P RGG KVTG LS KLARAVEAVAKR PQ LQE R I 123 

Query: 199 TQQ I ADA I Q E VLN PQ A VAVI VEAE HT CMSGRG I KKHGATTVTS TMRG L FQD D AS ARAE LL 258 

x IA+ + I E L+P V V+VEAEH CM-f RG++K GA TVTS +RG+F+DDA+ARAE+L 
Sbjct: 124 TSTIAES IVETLDPHGVMVWEA£HMCMTMRG\filKPGAKTVTSA 183 

Query: 259 QLIKK 263 
♦ IK+ 

Sbjct: 184 EHIKR 188 

Query= sid|114843|lan|dp1ORF022 Phage dp1 ORF|30896-31675|2 
(259 letters) 

>gi|2347102 (U77367) internalin [Listeria monocytogenes] 
Length = 821 

Score = 55.0 bits (130), Expect = 5e-07 

Identities = 44/149 (29%), Positives = 63/149 (41%), Gaps = 13/149 (8%) 

Query: 119 FRMNIYVPNYVG- -DSIVKYVXITLNNCTGKAPGLSIGKEFYAPEFNIKAREATKAGLPV 176 

p + VPN + D + + NN T AP L Y PE +K + K + 

Sbjct: 383 FSKTLSVPNNITSIDGTLIAPETISNNGTYDAPNLKWSLP^LPE- -VKYTFSQKIPIGT 440 

Query: 177 KSMDYVAQLPAVLR RVTFDLNGGTGTADAVRVEAGKKISPKPVDPTLTGKAFKGW 231 

+ + Y + L+ +VTF++ G T + V E +■ P+P PT G F GW 
Sbjct: 441 GTSKYSGFITQPLKELLDYKVTFNVEGNTSEVETVTEE NLIPEPTSPTKQGYTFDGW 497 

Query: 232 - KVEG E S T I WD FDNHMM P DRD VKL V AQ FA 259 

E T WDF MP D+ L A F+ 
Sbjct: 498 YDAETGGTKWDFTTGQMPANDLTLYAHFS 526 

Score = 43.4 bits (100), Expect = 0.002 

Identities = 47/195 (24%), Positives = 73/195 (37%), Gaps = 12/195 (6%) 

Query: 72 YDLTFKDtTTFDPEIMALIEGGTVTlQQGGTIAGYDT- PMLAQGASNMKPFRMNIYVPNY- - 128 

YD + T + +G ♦ GG + T MA ♦ F +N Y N+ 

Sbjct: 547 YDALLNEPTTPTKQGYTFDGWYDAETGGNKWDFKTMK^ 606 

Query: 129 VGDS I VNYVKITLNNCTGKAPGLS IGKEFYAPEFNI KAREATKAGLPVKSMD YVAQL 185 

V ♦ ♦ Y + T G ♦ ♦ A K TK+P + A 

Sbjct: 607 DGE VKNET I A YDT LLN E PTT PT KQG YT FDGWYD AETGGTKWD F KTKE - M P AND VT L YAH F 665 

Query: 186 P AVLRRVT F D LNGGTGTADA VR VEAG KK I S P K PVD PTLTG KAFKGW - KVEG EST I WD FDN 244 

' * + FD++G T + V +A + P+P P+ TG +GW E T WDF 
Sbjct: 666 TINNYQANFDIDGAV-TEEWNYDA LIPEPTSPSKTGFTLEGWYDAEVGGTKWDFKT 721 

Query: 245 HMMPDRDVKLVAQFA 259 

MP D+ L A F+ 
Sbjct: 722 MKMPANDITLYAHFS 736 

Score = 38.3 bits (87), Expect = 0.057 

Identities = 42/169 (24%), Positives = 59/169 (34%), Gaps = 10/169 (5%) 

Query: 96 QQGGTIAGYDT-PMLAQGASNMKPFRMNIYVPNYVGDSIVNYVKIT LNNCTGKAPG 150 

+ GGT + T MA + F+NYN+O+V * . LN T 
Sbjct: 501 ETGGTKWDFTTGQMPANDLTLYAHFSVNSYQANFDIDGVVTNEAVVYDAXiLNEPTTPTKQ 560 

Query: 151 LS I G KE F YAP E FN I KARE AT KAG L P VKSMD YVAQ L P AVLRRVT FD LNGGTGTAD AVRVEA 210 

•f-YE ♦ +P++A + FD++G A 
Sbjct: 561 G YT FDG WYDAETGGNKWD FKTMKM P AND VAF YAHFT I NNYQAN FD I DG E VKNET I A 616 

Query: 211 GKKI S PKPVDPTLTGKAFKGW - KVEGESTI WDFDNHMMPDRJDVKLVAQF 258 

+ +P PT G F GW E T WDF MP DV L A F 
Sbjct: 617 YDT LLNE PTT PTKQGYT FDGWYD AETGGT KWD F KTKEM P AND VT L YAH F 665 



Query= sid|114850|lan|dp1ORF029 Phage dp1 ORF|662-1348|2 
(228 letters) 

>gi|2650185 (AE001074) succinoglycan biosynthesis regulator (exsB) 
[Archaeoglobus fulgidus] 
Length = 239 

Score = 119 bits (295), Expect = 2e-26 

Identities = 79/224 (35%), Positives = 113/224 (50%), Gaps = 11/224 (4%) 

Query: 1 MKSWLLSGGVDSATClJVIEVDKWGSK2miAIAFNYGQKHEA£LENAA>rVAMFYGVKFTI 60 

MKW+LLSGG+DS+T L +D G VHA+ F YGQKH E+E+A VA V+ 
Sbjct: 1 MKAVMLLSGGIDSSTLLYYLLD- -GGYEVTiAIjTFFYGQKHSKEIESAEICVAKAAKVRHLK 58 

Query: 61 LE I DS KI YXXXXXXLLQG KG E I SHGKS YAE I LA£ KEWDTYVP FRNGLMLSQXXXXXXXX 120 

++I.S 1+ L G+ E+ Y+E + + T VP RN ++LS 
Sbjct: 59 VDI-STIHDLISYGALTGEEEVPKA-FYSEEVQRR TIVPNRNMILLS- - IAAGYAV 110 

Query: 121 XXXXXXXXXXXXXXXXXXXPDCTPEFYNSMSNAMEYGT-GGKVTLVAPLLTLTKAQVVKW 179 

PDC EF ♦+ A+ V + AP + +TKA +V+ 

Sbjct: 111 K I G AKE VHYAAHLS D YS I YP DCRKE FVKALDT A VYLAN I WT P VE VRAP FVDMT KAD I VRL 170 

Query: 180 GIDLDVPYFLTRSCYESDAESCGTCATCIDRKKAFEENGMTDPI 223 

G+ L VPY LT SCYE C +C TC++R +AF NG+ DP+ 

Sbjct: 171 GLKLGVPYELTWSCYEGGDRPCLSCGTCLERTEAFLANGVKDPL 214 



Query= sid|114855|lan|dp1ORF034 Phage dp1 ORF|131-652|2 
(173 letters) 

>emb|CAB13248| (Z99111) similar to hypothetical proteins [Bacillus subtilis] 
Length = 165 

Score = 220 bits (556), Expect = 4e-57 

Identities = 103/139 (74%), Positives = 117/139 (84%) 

Query: 5 TTRTD AE LTG VT LLGNQDTKYD YD YN PD VLET F PNKH P ENNYL VT FDG YE FTS LC PKTGQ 64 

TTR ++EL GVTLLGNQ T Y + +Y PDVLE+FPNKH +Y V F + EFTSLCPKTGQ 
Sbjct: 2 TTRKESELEGVTLLGN^TNYLFEYAPDVIjESFPNKHv>nU)YFVKFNCPEFTSLCPKTC 61 

Query: 65 PDFANVFISYIPNEKMVESKSLKLYLFSFRNHGDFHEDCMNIILNDLYELMEPKYIEVMG 124 

PDFA + + ISYIP+EKMVESKSLKLYLFSFRNHGDFHEDCMNII+NDL ELM+P+YIEV G 
Sbjct: 62 PDFATIYISYIPDEKMVESKSLKLYLFSFRNHGDFHEDCMNIIMNDLIELMDPRYIEVWG 121 

Query: 125 LFTPRGGISIYPFVNKVNP 143 

FTPRGGISI P+ N P 
Sbjct: 122 KFTPRGGISIDPYTNYGKP 140 



Query= sid|114857|lan|dp1ORF036 Phage dp1 ORF|48808-49362|1 
(184 letters) 

>gi|1353529 (U38906) ORF12 [Bacteriophage r1t] 
Length = 296 

Score = 53.5 bits (126), Expect = 1e-06 

Identities = 42/149 (28%), Positives = 70/149 (46%), Gaps = 9/149 (6%) 

Query: 34 I ASNTVGNG KTS WAVRL LQR YLAETALDG R I VE KGMFWS AQL LT E FGD YNY FQTMQ E F L 93 

♦ S G GK+ A+ +L+. LTL ++ V + F + + F + + F+ 

Sbjct: 155 WSGPAGTGKSHLAMSILKDCLQHTDLT- -VIFASWSEVLHLIKDSFDNKDSFYSTEYFM 212 
Query: 94 ERFERLKTCELLVIDEIGGGSLTKASYPYLYDLVNYRVDNNLSTIYTTNYTDDEIIDLLG 153 

E F + +LLVID+IG +T+ S L R TI TTN DEI 

Sbjct: 213 EVF RNTOLLVIDDIGSEKITEWSMSLLTEVLDART KTIITTNLKSDEIRKKYH 265 

Query: 154 QRLYSRIYDTSWLDFQASNVRGLEVSEI 182 

R YSR+ + F N+ + VS + + 

Sbjct: 266 NRTYSRLFRGIGKKAFNFENIKDKRVSQL-294 

Query= sid|114859|lan|dp1ORF038 Phage dp1 ORF|1350-1871|3 
(173 letters) 

>sp|P44123|YB90_HAEIN HYPOTHETICAL PROTEIN HI1190 >gi|1074675|pir||F64021 hypothetical 
protein HI1190 - Haemophilus influenzae (strain Rd KW20) 
>gi|1574117 (U32798) 6-pyruvoyl tetrahydrobiopterin 
synthase, putative [Haemophilus influenzae Rd] 
Length = 141 

Score = 100 bits (247), Expect = 6e-21 

Identities = 59/143 (41%), Positives = 83/143 (57%), Gaps = 10/143 (6%) 

Query: 2 R VS KT LT FD AAHQ L VGH FG KC AN LHGHT YKVE I S LAGGTYDHG S S QGMWD F YHVKKI A - 60 

+ +SK +FD AH L GH GKC NLHGHTYK+++ ++G Y G+ + MV+DF +K I 
Sbjct: 3 KISKEFSFDMAHLLDGHDGKCQNLHGHTYKLQVEISGDLYKSGAKKAMVIDFSDLKSIVK 62 

Query: 61 GTFIDRLDHAVLL-QGNEP IALANAVDTKRVLFGFRTTAENMSRFLTVTTLTELMWK 115 

♦D +DHA + Q NE L +++K FRTTAE ++RF+ L + 

Sbjct: 63 KVILDPMDHAFIYDQTNERESQIATLLQKLNSKTFGVPFRTTAEEIARFIFNRLKH- - DE 120 

Query: 116 HARIDSIKLWETPTGCAECTYYE 138 

I SI+LWETPT + C Y E 
Sbjct: 121 QLSISSIRLWETPT- -SFCEYQE 141 

Query= sid|114860|lan|dp1ORF039 Phage dp1 ORF|3306-3803|3 
(165 letters) 

>emb|CAA68244| (X99978) ORF7; hydophobic protein [Lactobacillus plantarum] 
Length = 168 

Score = 64.4 bits (154), Expect = 5e-10 

Identities = 49/156 (31%), Positives = 84/156 (53%), Gaps = 9/156 (5%) 

Query: 8 WLVRTALIAALYVTLTVAFSAISY- -GPIQFRVSEALILLPLWNHRWTPGIVLGTIIANF 65 

W++ AL+AA+YV L + +A S G IQFRVSE L L ++N + + GIV G 1+ + 
Sbjct: 9 W I IN - AL VAAMYWLC LG P AAFS LASG AI Q F RVS EG LNHLA VFNRKY I WG I VAG V I L FD A 67 

Query: 66 FSP-LGLIDVLFGSLATFLGXXXXXXXXXXXSPLYSLICPVLA NAYLIALELRIVY 120 

F P L++VLFG + L ++ ♦ +A ♦ + + IAL + + + 

Sbjct: 68 FGPGASLLNVXFGGGQSLLALLVXTWLAPKLKTWQRMLLNIAIiFTV 127 

Query: 121 S-LPFWESVTYVGISEAIIVLISYFLISTLAKNNHF 155 

S + FW + +■ +SE 11+ 1+ ♦ + +L + HF 
Sbjct: 128 SG VA FW PTYLTT ALS E L 1 1 MS I TAP I MYS LDR VLH F 163 

Query= sid|114862|lan|dp1ORF041 Phage dp1 ORF|8208-8699|3 
(163 letters) 

>gi|2522313 (AF012906) dUTPase homolog [Bacillus subtilis] 
>gi|2634394|emb|CAB13893| (Z99114) similar to 
deoxyuridine 5'-triphosphate nucleotidohydrolase 
[Bacillus subtilis] >gi|3025643 (AF020713) putative 
dUTPase [Bacteriophage SPBc2] 
Length = 142 

Score = 108 bits (267), Expect = 2e-23 

Identities = 65/160 (40%), Positives = 83/160 (51%), Gaps = 25/160 (15%) 

Query: 5 VDVKMIDPKLDRLKYT- -GDWVDVRISSITKIDADSADVSRCRKVXQKAQVYSVAAGECI 62 

♦ +K +D R+ GDW+D+R + I D + 
Sbjct: 3 I K I KYLD ETQTR I NKMEQGDW I D LRAAED V A I KKD E F KL 41 

Query: 63 KIAHGFALELPKGYEAILHPRSSLFKKTGLIFVSS-GVIDEGYKGDTDEWFSVWYATRDA 121 

+ G A+ELP+GYEA ♦ PRSS +K G+I +S GVIDE YKGD D WF YA RD 
Sbjct: 42 -VP LGV AME L P EG YEAHW P RS ST YKU FGVI QTNSMG V IDE S YKGDND FWF F P A YALRDT 100 
Query: 122 DIFYDQRIAQFRIQEKQPAIKFNFVESLGNAARGGHGSTG 161 

I RI QFRI +K PA+ V+ LGN RGGHGSTG 
Sbjct: 101 KIKKGDRICQFRIMKKMPAVDLIEVDRLGNGDRGGHGSTG 140 

Query= sid|114867|lan|dp1ORF046 Phage dp1 ORF|42774-43202|3 
(142 letters) 

>emb|CAB07984| (Z93946) hypothetical protein [bacteriophage Dp-1] 
Length = 142 

Score = 287 bits (728), Expect = 2e-77 

Identities = 142/142 (100%), Positives = 142/142 (100%) 

Query 1 MPKWLNDTAvXTTIITACSGvT,™ 60 
MPMWL^TA^TTIITACSGVLTVLI^KiFEWKSNKAKSVLEDISTTLSTLKQQ^GIDQ 

Sbjct: 1 MPMWI^DTAVLTTIITACSGVLTvTiLNKLFEWKSNKA^ 60 

Query: 61 TTVAINHQNDVIQDGTRKIQRYRLYHDLKREVITGYTTLDHFRELSILFESYKNLGGNGE 120 

TTVAimiQtroVIQOGTRKIQRYRLYHDLKREVITGYTTLDHFRELSILFESYKNLGGNGE 
Sbjct: 61 TTVAI NHQND V I QDGT RK I Q R YRL YHDLKREV I TG YTTLD HF RE LS I L F E S YKNLGGNG E 120 

Query: 121 VEALYEKYKKLPIREEDLDETI 142 

VEALYEKYKKLPIREEDLDETI 
Sbjct: 121 VEALYEKYKKLPIREEDLDETI 142 

Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1 
(89 letters) 

>emb|CAB07983| (Z93946) hypothetical protein [bacteriophage Dp-1] 
Length = 124 

Score = 147 bits (367), Expect = 1e-35 

Identities = 75/75 (100%), Positives = 75/75 (100%) 

Query 1 MLNLTKS RQ I VAE FT I GQGAE KKL VKTT I VN I DAN AVSTVS ETLHD PDLYAANRRE LRAD 60 

MLNLTKSRQIVAEFTIGQGAEKKLVKTTIVNIDANAVSTVSETLHDPDLYAANRRELRAD 
Sbjct: 1 MLNLTKS RQ I VAE FTIGQGAEKKLVKTT I VN I DANAVSTVSETLHD PDLYAANRRE LRAD 60 

Query: 61 EQKLRETRYAI EDEI 75 

EQKLRETRYAI EDE I 
Sbjct: 61 EQKLRETRYAI EDE I 75 

Query= sid|114912|lan|dp1ORF091 Phage dp1 ORF|43189-43413|1 
(74 letters) 

>emb|CAB07985| (Z93946) holin [bacteriophage Dp-1] 
Length = 74 

Score = 63.2 bits (151), Expect = 2e-10 
Identities = 34/74 (45%), Positives = 34/74 (45%) 

Query 1 hUaSNEQYDXXXXXXXXXXX^^ 60 

MKLSNEQYD YQFD VLGVSSR 

Sbjct: 1 MKLSNEQYD VAKNVVTVWPAAIAL I TGIX3ALYQFDTTAITGT I ALLATFAGTVLGVSSR 60 

Query: 61 NYQKEQEAQNNEVE 74 

NYQKEQEAQNNEVE 
Sbjct: 61 NYQKEQEAQNNEVE 74 
Condensed listing of homology information from above 



Phage: dp1 
Database: nr 
Program: Blastp 
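
Each row of the condensed listing ends with the bit score followed by the E-value, so a row can be split from the right. The following is a minimal Python sketch, assuming the row text is intact with whitespace before both trailing fields. The helper name `parse_hit_row` is hypothetical; the handling of `e-130` assumes the usual BLAST shorthand for `1e-130`.

```python
def parse_hit_row(row):
    """Split one summary row into (description, bit_score, e_value)."""
    # The last two whitespace-separated fields are the score and the E-value.
    description, score, e_value = row.rsplit(None, 2)
    if e_value.startswith("e"):  # "e-130" is shorthand for "1e-130"
        e_value = "1" + e_value
    return description, int(score), float(e_value)

if __name__ == "__main__":
    row = "gi|2760912 (AF037258) RecA protein [Chlorobium tepidum] 133 2e-30"
    print(parse_hit_row(row))
```

Splitting from the right keeps the description whole even though it contains spaces, parentheses, and brackets.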

Query= sid|114822|lan|dp1ORF001 Phage dp1 ORF|36698-40390|2 
(1230 letters) 

gi|2444124 (U88974) ORF45 [Streptococcus thermophilus temperate ...  467  e-130 
gi|928828 (L44593) ORF1904; putative [Lactococcus lactis phage B...  427  e-118 
gi|2935676 (AF032121) unknown [Streptococcus thermophilus bacter...  309  1e-82 
gi|2935691 (AF032122) unknown [Streptococcus thermophilus bacter...  306  7e-82 
gi|3540289 (AF057033) putative anti-receptor [Streptococcus ther...  279  6e-74 
gi|4530154|gb|AAD21894.1| (AF085222) putative tail-host specific...  220  3e-56 
gi|930045|emb|CAA33387| (X15332) alpha-1 (III) collagen [Homo sa...  58  4e-07 
gi|1070603|pir||CGHU7L collagen alpha 1(III) chain precursor - h...  58  4e-07 
gi|4502951|ref|NP_000081.1|PCOL3A1| collagen, type III, alpha 1 ...  58  4e-07 
gi|115290|sp|P04258|CA13_BOVIN COLLAGEN ALPHA 1(III) CHAIN >gi|7...  58  4e-07 
gi|575322|emb|CAA36279| (X52046) type III collagen [Mus musculus]  57  8e-07 
gi|2119163|pir||S59856 collagen alpha 1(III) chain precursor - ra...  57  8e-07 
gi|543912|sp|P13941|CA13_RAT COLLAGEN ALPHA 1(III) CHAIN >gi|543...  57  1e-06 
gi|3171998|emb|CAA06510| (AJ005395) collagen alpha 1 (III) [Ratt...  57  1e-06 
gi|3947565|emb|CAA90250| (Z49967) similar to collagen; cDNA EST ...  54  7e-06 
gi|423403|pir||A46053 bullous pemphigoid antigen, BPAG2, type XV...  53  9e-06 
gi|115410|sp|P12114|CCS1_CAEEL CUTICLE COLLAGEN SQT-1 >gi|84437|...  53  9e-06 
gi|3873801|emb|CAA90084| (Z49907) cuticle collagen SQT-1; cDNA E...  53  9e-06 



Query= sid|114823|lan|dp1ORF002 Phage dp1 ORF|32386-35835|1 
(1149 letters) 

gi|3341922|dbj|BAA31888| (AB009866) orf 15 [bacteriophage phi PVL]  280  3e-74 
gi|4126622|dbj|BAA36642.1| (AB016282) ORF36 [bacteriophage phi-105]  232  1e-59 
gi|1369948|emb|CAA59194| (X84706) host interacting protein [Bact...  201  3e-50 
gi|3139112 (AF063097) gpT [Bacteriophage P2]  188  2e-46 
gi|3337272 (U32222) G protein [Bacteriophage 186]  161  3e-38 
gi|4063799|dbj|BAA36253| (AB008550) orf25; similar to T gene of ...  159  8e-38 
gi|3172274 (AF022214) minor tail subunit; putative tape-measure ...  123  6e-27 
gi|465127|sp|Q05233|VG26_BPML5 MINOR TAIL PROTEIN GP26 >gi|41904...  108  2e-22 
gi|3540284 (AF057033) putative minor tail protein [Streptococcus...  99  2e-19 
gi|2444119 (U88974) ORF40 [Streptococcus thermophilus temperate ...  90  6e-17 
gi|2634555|emb|CAB14053| (Z99115) yoml [Bacillus subtilis] >gi|3...  66  1e-09 
gi|2392838 (AF011378) unknown [Bacteriophage sk1]  64  5e-09 
gi|2764873|emb|CAA66557| (X97918) gene 18.1 [Bacteriophage SPP1]  62  3e-08 
gi|1353559 (U38906) ORF42 [Bacteriophage r1t]  61  6e-08 
|S39079 puff C-8 protein - fungus gnat (Rhynchosci...  55  2e-06 
P51731|Y027_BPHP1 HYPOTHETICAL 72.8 KD PROTEIN IN ...  53  8e-06 
11101273J ORF 7 [Bacteriophage HP1]  53  1e-05 



gi|118825|sp|P00582|DPO1_ECOLI DNA POLYMERASE I (POL I) >gi|6705...  193  3e-48 
gi|2982102|pdb|1KFS|A Chain A, All-Oxygen Dna Complexed To The 3...  193  3e-48 
gi|229889|pdb|1DPI| DNA Polymerase I (Klenow Fragment) (E.C. 2  193  3e-48 
gi|1169402|sp|P43741|DPO1_HAEIN DNA POLYMERASE I (POL I) >gi|107...  191  1e-47 
gi|2688462 (AE001156) DNA polymerase I (polA) [Borrelia burgdorf...  190  3e-47 
gi|809180|pdb|1KLN|A Escherichia coli r  190  3e-47 
gi|1913934|emb|CAA72997| (Y12328) DNA-directed DNA polymerase I ...  189  8e-47 
gi|4090935 (AF028719) DNA polymerase type I [Rhodothermus sp. 'I...  175  1e-42 
gi|4731571|gb|AAD28505.1|AF121780_1 (AF121780) DNA polymerase I ...  174  2e-42 
gi|1633576 (U57757) similar to proofreading 3'-5' exonuclease an...  173  4e-42 
gi|3322368 (AE001195) DNA polymerase I (polA) [Treponema pallidum]  172  9e-42 
gi|1006595|dbj|BAA10748| (D64005) DNA polymerase I [Synechocysti...  171  2e-41 
gi|585062|sp|Q07700|DPO1_MYCTU DNA POLYMERASE I (POL I) >gi|4161...  163  5e-39 
gi|4376908|gb|AAD18751| (AE001645) DNA Polymerase I [Chlamydia p...  157  2e-37 
gi|1169403|sp|P46835|DPO1_MYCLE DNA POLYMERASE I (POL I) >gi|107...  152  7e-36 
gi|2145839|pir||S72949 DNA polymerase I - Mycobacterium leprae >...  152  7e-36 
gi|1405438|emb|CAA67184| (X98575) DNA-dependent DNA polymerase (...  152  9e-36 
gi|2506365|sp|P80194|DPO1_THECA DNA POLYMERASE I, THERMOSTABLE (...  147  2e-34 
gi|3328929 (AE001322) DNA Polymerase I [Chlamydia trachomatis]  147  3e-34 
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1 3913S10 | sp| 052225 |DP01_THEFI DNA POLYMERASE I, THERMOSTABLE (... 146 

| 1205984 (U33536) DMA polymerase I (Bacillus stearothermophilus] 146 

|118827| sp| P13252 |DP01_STRPN DNA POLYMERASE I (POL I) >gi|9802... 145 

| 1942202 | pdb| 1JXE| Stoffel Fragment Of Taq Dna Polymerase I 145 

|l943520|pdb|lKTQ| Dna Polymerase 145 

| 1084022 |pir | | JX0359 DNA-directed DNA polymerase (EC 2.7.7.7) ... 145 

j 507891 | dbj | BAA06775 | (D32013) DNA polymerase (Thermus aquaticus] 145 

| 118828 |sp|P1982l|DP01JTHEAQ DNA POLYMERASE I, THERMOSTABLE (T. . . 145 

| 1706502 |sp|P52028|DP01JTHETH DNA POLYMERASE I , THERMOSTABLE (... 144 

| 1097211 jprf | |2113329A DNA polymerase {Thermus aquaticus therm... 144 

|2098289|pdb|lTAU|A Chain A, Structure Of Dna Polymerase 143 



gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 

Query* sid 1 114825 | lan| dplORF004 Phage dpi ORF| 40401-42440 | 3 
(679 letters) 

gi| 193476l|emb|CAB0798l| (Z93946) hypothetical protein [bacterio. . 
gi j 3540290 (AF057033) putative minor structural protein [Strepto. . 
gi|2444125 (U88974) ORF46 [Streptococcus thermophilus temperate .. 
gi| 1934762 | emb|CAB07982 | (Z93946) hypothetical protein (bacterio.. 
gi|453015S|gb|AAD21895.l| (AF085222) unknown [Streptococcus ther. . 
gi j 2935677 (AF032121) unknown [Streptococcus thermophilus bacter.. 
gi|2935692 (AF032122) unknown [Streptococcus thermophilus bacter.. 
gi|H36289 (U42S97) histidine kinase A (Dictyostelium discoideum] 

Query= sid| 114827 | lan| dplORF006 Phage dpi ORF | 45296-46987 | 2 
(563 letters) 



4377165 | gb| AAD18987 | (AE001666) SWI/SNF family helicase_2 (Ch. 
1769947|emb|CAA67095| (X98455) SNF [Bacillus cereus] 
3329163 (AE001341) SWF/SNF family helicase (Chlamydia trachom. 
4377149|gb|AAD18973| (AE001664) SWI/SNF family heiicase_l (Ch. 
332899S (AE001326) SWI/SNF family helicase (Chlamydia trachom. 
2493354 | sp| P75093 |Y018_MYCPN HYPOTHETICAL HELICASE MG018/MG01. 
16 5374 8 | dbj | BAA18 65 9 | (D90916) helicase of the snf2/rad54 fam. 
1763712|emb|CAB05939| (Z83337) member of the SNF2 helicase fa. 
2636153|emb|CAB15645.l| (Z99122) similar to SNF2 helicase (Ba. 
2909552 | emb|CAA17284 | (AL021924) helZ [Mycobacterium tubercul. 
3844627 (U39681) ATP-dependent RNA helicase, putative [Mycopl. 
1351463|sp|P47264|Y018_MYCGE HYPOTHETICAL HELICASE MG018 
2660669 (AC002342) human Mi-2 autoant igen-like protein [Arabi. 
1361537 | pir| | 164201 helicase (motl) homolog - Mycoplasma geni. 
3482977|emb|CAA20533.l| (AL031369) putative protein [Arabidop. 
3298562 (U91543) zinc-finger helicase [Homo sapiensj 
3875971 | emb|CAB02491 | (Z80344) similar to helicase; cDNA EST . 
4 5 57451 | ref | NP_001263 . 1 | PCHD3 | chromodomain helicase DNA bind. 
264S435 (AF007780) CHD3 (Drosophila melanogaster] 
3875165|emb|CAA91798| (Z67881) Similarity to Mouse Chromodoma. 



gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 

Query= sid|114828|lan|dp1ORF007 Phage dp1 ORF|22230-23621|3
(463 letters)

gi|2444105 (U88974) ORF26 [Streptococcus thermophilus temperate ...

gi|3318666 (U19754) BBA31 homolog [Borrelia burgdorferi]

gi|2690260 (AE000790) conserved hypothetical protein [Borrelia b...

Query= sid|114829|lan|dp1ORF008 Phage dp1 ORF|49624-50961|1
(445 letters)

gi|4406210|gb|AAD19901| (AF100420) DnaB replication fork helicas...
gi|3121983|sp|O25916|DNAB_HELPY REPLICATIVE DNA HELICASE >gi|231...
gi|4416322|gb|AAD20314| (AF106032) replicative helicase; DnaB [B...
gi|4155895 (AE001551) REPLICATIVE DNA HELICASE [Helicobacter pyl...
gi|3322317 (AE001191) replicative DNA helicase (dnaB) [Treponema...
gi|138031|sp|P04530|VG41_BPT4 PRIMASE-HELICASE (PROTEIN GP41) >g...
gi|2983861 (AE000742) replicative DNA helicase [Aquifex aeolicus]

Query= sid|114831|lan|dp1ORF010 Phage dp1 ORF|8699-9859|2
(386 letters)

gi|2760912 (AF037258) RecA protein [Chlorobium tepidum]
gi|3219851|sp|P94666|RECA_CLOPE RECA PROTEIN >gi|1698591 (U61497...
gi|1350566|sp|P48295|RECA_STRVL RECA PROTEIN >gi|508860 (U04837)...
gi|744163|prf||2014250A recA-like protein [Streptomyces violaceus]
gi|730487|sp|P41054|RECA_STRAM RECA PROTEIN >gi|511133|emb|CAA82...
gi|2687334|emb|CAA15875| (AL020958) RecA protein [Streptomyces c...
gi|1350565|sp|P48294|RECA_STRLI RECA PROTEIN >gi|481482|pir||S38...



1011 
346 
339 
300 
276 
250 
250 
50 



171 
160 
159 
157 
153 
146 
143 
143 
143 
140 
136 
136 
131 
129 
128 
120 
120 
120 
118 
118 



89 
59 
56 



68 
67 
65 
60 
58 
53
51 



133 
129 
128 
126 
125 
125 
125 



7e-34 
7e-34 
9e-34 
1e-33
1e-33
1e-33
1e-33
1e-33
2e-33 
2e-33 
3e-33 



0.0 

2e-94 

3e-92 

2e-80 

4e-73 

3e-65 

3e-65 

7e-05 



1e-41
3e-38 
6e-38 
2e-37 
2e-36 
4e-34 
3e-33 
4e-33 
4e-33 
2e-32 
3e-31 
4e-31 
2e-29 
4e-29 
9e-29 
2e-26 
2e-26 
2e-26 
1e-25
1e-25



7e-17 
7e-08 
5e-07 



2e-10 
2e-10 
9e-10 
4e-08 
1e-07
3e-06
1e-05



2e-30
3e-29 
7e-29 
3e-28 
4e-28 
6e-28 
6e-28 
gi|464599|sp|P33542|RECA_AQUPY RECA PROTEIN >gi|1086167|pir||A55... 123 2e-27

gi|417636|sp|P32725|RECA_RHOSH RECA PROTEIN >gi|541307|pir||S415... 123 2e-27

gi|2984348 (AE000775) recombination protein RecA [Aquifex aeolicus] 123 2e-27

gi|3219854|sp|P95846|RECA_STRRM RECA PROTEIN >gi|1729800|emb|CAA... 122 4e-27

gi|2500086|sp|Q59560|RECA_MYCSM RECA PROTEIN >gi|1430892|emb|CAA... 122 4e-27

gi|1350567|sp|P48296|RECA_THEAQ RECA PROTEIN >gi|1072963|pir||A5... 122 6e-27

gi|625663|pir||JX0292 recA protein - Thermus aquaticus (strain HB8) 121 1e-26

gi|1172880|sp|P42440|RECA_CAMJE RECA PROTEIN >gi|2119991|pir||14... 120 2e-26

gi|4154654 (AE001453) RECA PROTEIN [Helicobacter pylori J99] 120 2e-26

gi|1072968|pir||C55020 recA protein - Thermus sp >gi|458472|dbj|... 120 2e-26

gi|3219852|sp|P95469|RECA_PARDE RECA PROTEIN >gi|1825468 (U59631... 119 3e-26

gi|2507284|sp|P42445|RECA_HELPY RECA PROTEIN >gi|2313235|gb|AAD0... 119 4e-26

gi|1172890|sp|Q02350|RECA_STAAU RECA PROTEIN >gi|463285 (L25893)... 118 5e-26

gi|4416209|gb|AAD20261| (AF094756) RecA protein [Bifidobacterium... 118 5e-26

gi|2500084|sp|Q59180|RECA_BORBU RECA PROTEIN >gi|1276443 (U23457... 118 5e-26

Query= sid|114832|lan|dp1ORF011 Phage dp1 ORF|28017-29096|3
(359 letters)

gi|2444110 (U88974) ORF31 [Streptococcus thermophilus temperate ... 187 1e-46

gi|3320438 (AF057033) gp348 [Streptococcus thermophilus bacterio... 179 2e-44

gi|479514|pir||S34244 hypothetical protein p38 - actinophage VWB... 62 8e-09

Query= sid|114834|lan|dp1ORF013 Phage dp1 ORF|10215-11240|3
(341 letters)

gi|580855|emb|CAA29958| (X06803) dnaZX-like ORF put. DNA polymer... 182 2e-45
gi|118807|sp|P09122|DP3X_BACSU DNA POLYMERASE III SUBUNITS GAMMA... 182 2e-45
gi|98292|pir||S13786 DNA-directed DNA polymerase (EC 2.7.7.7) II... 182 2e-45
gi|1527142 (U66040) DNA polymerase III gamma subunit [Salmonella... 172 4e-42
gi|2494197|sp|P74876|DP3X_SALTY DNA POLYMERASE III SUBUNITS GAMM... 172 4e-42
gi|118808|sp|P06710|DP3X_ECOLI DNA POLYMERASE III SUBUNITS GAMMA... 170 1e-41
gi|4155207 (AE001497) DNA POLYMERASE III SUBUNITS GAMMA AND TAU ... 169 2e-41
gi|2313841|gb|AAD07767.1| (AE000584) DNA polymerase III gamma an... 168 4e-41
gi|2583049 (AF025391) DNA polymerase III holoenzyme tau subunit ... 166 3e-40
gi|2984127 (AE000759) DNA polymerase III gamma subunit [Aquifex ... 166 3e-40
gi|3861390|emb|CAA15289| (AJ235273) DNA POLYMERASE III SUBUNITS ... 165 5e-40
gi|1169397|sp|P43746|DP3X_HAEIN DNA POLYMERASE III SUBUNITS GAMM... 156 2e-37
gi|1293572 (U49738) DNA polymerase III tau homolog DnaX [Cauloba... 151 8e-36
gi|3328753 (AE001306) DNA Pol III Gamma and Tau [Chlamydia trach... 148 4e-35
gi|4376294|gb|AAD18193| (AE001589) DNA Polymerase III Gamma and ... 148 5e-35
gi|581255|emb|CAA28175| (X04487) alternate dnaZX protein (AA 1-6... 146 3e-34
gi|2688379 (AE001151) DNA polymerase III, subunits gamma and tau... 140 2e-32
gi|3323329 (AE001268) DNA polymerase III, subunits gamma and tau... 137 1e-31

Query= sid|114835|lan|dp1ORF014 Phage dp1 ORF|50961-51974|3
(337 letters)

gi|1346796|sp|P47492|PRIM_MYCGE DNA PRIMASE >gi|1361496|pir||F64... 57 2e-07
gi|740008|prf||2004290A primase [Haemophilus influenzae] 51 1e-05
gi|1172619|sp|Q08346|PRIM_HAEIN DNA PRIMASE >gi|1074033|pir||A64... 51 1e-05
gi|1709769|sp|Q04505|PRIM_LACLA DNA PRIMASE >gi|1075726|pir||JC2... 51 1e-05
gi|639846|dbj|BAA03516| (D14690) DNA primase [Lactococcus lactis] 51 1e-05

Query= sid|114837|lan|dp1ORF016 Phage dp1 ORF|43413-44303|3
(296 letters)

gi|1934766|emb|CAB07986| (Z93946) N-acetylmuramoyl-L-alanine ami... 661 0.0
gi|113676|sp|P06653|ALYS_STRPN AUTOLYSIN (N-ACETYLMURAMOYL-L-ALA... 221 4e-57
gi|282326|pir||A42935 N-acetylmuramoyl-L-alanine amidase (EC 3.5... 219 3e-56
gi|416618|sp|P32762|ALYS_BPHB3 LYTIC AMIDASE (N-ACETYLMURAMOYL-L... 212 2e-54
gi|285273|pir||A42936 N-acetylmuramoyl-L-alanine amidase (EC 3.5... 212 2e-54
gi|127787|sp|P15057|LYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE) ... 162 4e-39
gi|67761|pir||MUBPCP N-acetylmuramoyl-L-alanine amidase (EC 3.5... 162 4e-39
gi|127789|sp|P19386|LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE)... 160 1e-38
gi|928832 (L44593) ORF259; putative [Lactococcus lactis phage BK... 119 2e-26
gi|2511705|emb|CAA71783| (Y10818) sigA binding protein [Streptoc... 111 9e-24
gi|4097980 (U72655) surface protein C [Streptococcus pneumoniae] 107 1e-22
gi|2351768 (U89711) PspA [Streptococcus pneumoniae] 105 4e-22
gi|2425109 (AF019904) choline binding protein A [Streptococcus p... 104 6e-22
gi|282335|pir||A41971 surface protein pspA precursor - Streptoco... 104 1e-21
gi|2576331|emb|CAA05158| (AJ002054) SpsA protein [Streptococcus ... 103 2e-21
gi|2127295|pir||S57962 cspC protein - Clostridium acetobutylicum... 85 6e-16
gi|2576333|emb|CAA05159| (AJ002055) SpsA protein [Streptococcus ... 84 1e-15
gi|4106522|gb|AAD02874.1| (AF097909) excreted protein FibB [Pept... 83 3e-15
gi|1361406|pir||S57714 cspB protein - Clostridium acetobutylicum... 82 4e-15
gi|1914872|emb|CAB04758| (Z82001) PCPA [Streptococcus pneumoniae] 81 9e-15
gi|3168594|dbj|BAA28613| (AB012763) SpaA [Erysipelothrix rhusiop... 81 1e-14
gi|2292750|emb|CAA64942| (X95646) homology to orf259 of lactococ... 80 3e-14
gi|2935696 (AF032122) putative lysin [Streptococcus thermophilus ... 80 3e-14
gi|4586910|dbj|BAA76540.1| (AB017447) protective antigen SpaA.1 ... 80 3e-14
gi|3540294 (AF057033) lysin [Streptococcus thermophilus bacterio... 79 5e-14

Query= sid|114841|lan|dp1ORF020 Phage dp1 ORF|1864-2658|1
(264 letters)

gi|2633745|emb|CAB13247| (Z99111) similar to coenzyme PQQ synthe... 217 5e-56
gi|2808502|emb|CAA12532| (AJ225561) ExsD protein [Sinorhizobium ... 163 1e-39
gi|3861151|emb|CAA15051| (AJ235272) unknown [Rickettsia prowazekii] 82 6e-15
gi|1652793|dbj|BAA17712| (D90908) hypothetical protein [Synechoc... 76 3e-13
gi|1723815|sp|P55139|YGCF_ECOLI HYPOTHETICAL 25.0 KD PROTEIN IN ... 70 2e-11
gi|2984272 (AE000769) hypothetical protein [Aquifex aeolicus] 66 4e-10
gi|4155435 (AE001516) putative [Helicobacter pylori J99] 57 1e-07
gi|2127833|pir||C64505 coenzyme PQQ synthesis protein III homolo... 55 5e-07
gi|2622338 (AE000890) coenzyme PQQ synthesis protein III [Methan... 54 9e-07
gi|3257042|dbj|BAA29725| (AP000003) 254aa long hypothetical prot... 53 2e-06
gi|2314068|gb|AAD07976.1| (AE000602) conserved hypothetical prot... 52 6e-06
gi|1723816|sp|P45097|YGCF_HAEIN HYPOTHETICAL PROTEIN HI1189 >gi|... 50 2e-05

Query= sid|114842|lan|dp1ORF021 Phage dp1 ORF|2504-3295|2
(263 letters)

gi|127481|sp|P19465|GCH1_BACSU GTP CYCLOHYDROLASE I (GTP-CH-I) >... 208 4e-53
gi|3242315|emb|CAA04237| (AJ000685) GTP cyclohydrolase [Streptoc... 191 4e-48
gi|2494695|sp|Q54769|GCH1_SYNP7 GTP CYCLOHYDROLASE I (GTP-CH-I) ... 189 2e-47
gi|255061|bbs|112832 (S44049) GTP cyclohydrolase I {clone hGCH-1... 187 7e-47
gi|4503949|ref|NP_000152.1|PGCH1| GTP cyclohydrolase 1 (dopa-res... 187 7e-47
gi|2113967|emb|CAB08935| (Z95557) folE [Mycobacterium tuberculosis] 187 7e-47
gi|1730240|sp|P50141|GCH1_CHICK GTP CYCLOHYDROLASE I (GTP-CH-I) 185 3e-46
gi|2494696|sp|Q55759|GCH1_SYNY3 GTP CYCLOHYDROLASE I (GTP-CH-I) ... 184 5e-46
gi|121061|sp|P22288|GCH1_RAT GTP CYCLOHYDROLASE I PRECURSOR (GTP... 184 6e-46
gi|3183014|sp|O13774|GCH1_SCHPO GTP CYCLOHYDROLASE I (GTP-CH-I) ... 184 6e-46
gi|3097224|emb|CAA18795| (AL023093) GTP cyclohydrolase I [Mycoba... 182 2e-45
gi|2494697|sp|Q19980|GCH1_CAEEL PROBABLE GTP CYCLOHYDROLASE I (G... 182 2e-45
gi|462167|sp|Q05915|GCH1_MOUSE GTP CYCLOHYDROLASE I PRECURSOR (G... 180 7e-45
gi|1669664|emb|CAA89808| (Z49706) GTP cyclohydrolase I [Dictyost... 180 1e-44
gi|2981082 (AF052048) GTP-cyclohydrolase [Ostertagia ostertagi] 178 3e-44
gi|31954|emb|CAA78908| (Z16418) GTP cyclohydrolase I [Homo sapi... 177 8e-44
gi|551344|bbs|150280 (S71373) GTP cyclohydrolase I [mice, Peptid... 174 5e-43
gi|1730247|sp|P51601|GCH1_YEAST GTP CYCLOHYDROLASE I (GTP-CH-I) 174 7e-43
gi|1246912|emb|CAA87397| (Z47201) GTP cyclohydrolase 1 [Saccharo... 172 2e-42
gi|1730246|sp|P51595|GCH1_STRPN GTP CYCLOHYDROLASE I (GTP-CH-I) ... 168 3e-41
gi|2982951 (AE000680) GTP cyclohydrolase I [Aquifex aeolicus] 164 6e-40

Query= sid|114843|lan|dp1ORF022 Phage dp1 ORF|30896-31675|2
(259 letters)

gi|2347102 (U77367) internalin [Listeria monocytogenes] 55 5e-07
gi|3123226|sp|P25146|INLA_LISMO INTERNALIN A PRECURSOR >gi|48705... 52 4e-06
gi|149674 (M67471) internalin [Listeria monocytogenes] 52 4e-06

Query= sid|114850|lan|dp1ORF029 Phage dp1 ORF|662-1348|2
(228 letters)

gi|2650185 (AE001074) succinoglycan biosynthesis regulator (exsB... 119 2e-26
gi|3861231|emb|CAA15131| (AJ235272) unknown [Rickettsia prowazekii] 117 8e-26
gi|2622210 (AE000881) conserved protein [Methanobacterium thermo... 108 4e-23
gi|2983380 (AE000709) trans-regulatory protein ExsB [Aquifex aeo... 88 6e-17
gi|1001327|dbj|BAA10814| (D64006) ExsB [Synechocystis sp.] 88 6e-17
gi|2128055|pir||B64468 hypothetical protein homolog MJ1347 - Met... 83 1e-15
gi|4155143 (AE001491) putative [Helicobacter pylori J99] 82 4e-15
gi|2313760|gb|AAD07701.1| (AE000578) conserved hypothetical prot... 80 2e-14
gi|2120814|pir||S60183 protein ExsB - Rhizobium meliloti >gi|114... 76 3e-13
gi|2633743|emb|CAB13245| (Z99111) similar to hypothetical protei... 75 5e-13
gi|1175543|sp|P44124|YBAX_HAEIN HYPOTHETICAL PROTEIN HI1191 >gi|... 74 1e-12
gi|2495537|sp|P77756|YBAX_ECOLI HYPOTHETICAL 25.5 KD PROTEIN IN ... 71 5e-12
gi|3256471|dbj|BAA29154.1| (AP000001) 269aa long hypothetical pr... 67 1e-10
gi|2921156 (AF022216) aluminum resistance protein [Arthrobacter ... 54 1e-06

Query= sid|114855|lan|dp1ORF034 Phage dp1 ORF|131-652|2
(173 letters)

gi|2633746|emb|CAB13248| (Z99111) similar to hypothetical protei... 220 4e-57
53 1e-06



100 
67 
65 
58 
55 
50 
50 



6e-21 
7e-11
3e-10 
4e-08 
2e-07 
8e-06 
1e-05



gi|4155926 (AE001554) putative [Helicobacter pylori J99] 162 1e-39

gi|2314588|gb|AAD08456.1| (AE000642) conserved hypothetical prot... 161 3e-39

gi|2983458 (AE000714) hypothetical protein [Aquifex aeolicus] 103 9e-22

gi|1006604|dbj|BAA10757| (D64005) hypothetical protein [Synechoc... 87 6e-17

gi|2967529 (U11045) unknown [Buchnera aphidicola] 79 2e-14

gi|2495654|sp|Q46920|YQCD_ECOLI HYPOTHETICAL 32.6 KD PROTEIN IN ... 69 2e-11

gi|1175604|sp|P44153|YQCD_HAEIN HYPOTHETICAL PROTEIN HI1291 >gi|... 63 1e-09

gi|3860642|emb|CAA14543| (AJ235270) unknown [Rickettsia prowazekii] 56 1e-07

Query= sid|114857|lan|dp1ORF036 Phage dp1 ORF|48808-49362|1
(184 letters)

gi|1353529 (U38906) ORF12 [Bacteriophage r1t]

Query= sid|114859|lan|dp1ORF038 Phage dp1 ORF|1350-1871|3
(173 letters)

gi|1175542|sp|P44123|YB90_HAEIN HYPOTHETICAL PROTEIN HI1190 >gi|...
gi|2982977 (AE000681) hypothetical protein [Aquifex aeolicus]
gi|3860744|emb|CAA14645| (AJ235270) unknown [Rickettsia prowazekii]
gi|2650193 (AE001074) conserved hypothetical protein [Archaeoglo...
gi|3258383|dbj|BAA31066.1| (AP000007) 157aa long hypothetical pr...
gi|1001713|dbj|BAA10550| (D64004) hypothetical protein [Synechoc...
gi|4155434 (AE001516) putative [Helicobacter pylori J99]

Query= sid|114860|lan|dp1ORF039 Phage dp1 ORF|3306-3803|3
(165 letters)

gi|1922884|emb|CAA68244| (X99978) ORF7; hydophobic protein [Lact...

Query= sid|114862|lan|dp1ORF041 Phage dp1 ORF|8208-8699|3
(163 letters)

gi|2522313 (AF012906) dUTPase homolog [Bacillus subtilis] >gi|26...
gi|2634150|emb|CAB13650| (Z99113) similar to deoxyuridine 5'-tri...
gi|3913546|sp|O54134|DUT_STRCO DEOXYURIDINE 5'-TRIPHOSPHATE NUCL...
gi|3913542|sp|O48500|DUT_BPT5 DEOXYURIDINE 5'-TRIPHOSPHATE NUCLE...
gi|391354a|sp|O68992|DUT_CHLTE DEOXYURIDINE 5'-TRIPHOSPHATE NUCL...

Query= sid|114867|lan|dp1ORF046 Phage dp1 ORF|42774-43202|3
(142 letters)

gi|1934764|emb|CAB07984| (Z93946) hypothetical protein [bacterio...

Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1
(89 letters)

gi|1934763|emb|CAB07983| (Z93946) hypothetical protein [bacterio... 147 1e-35
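
The query blocks in this listing follow a regular layout once the OCR damage is normalised: a "Query=" header giving the ORF coordinates, a length in letters, and hit lines that end in a score and an E-value. The following is a minimal Python sketch of how such lines could be read and cross-checked. The names `parse_hit`, `parse_query` and `expected_letters` are hypothetical and not part of the original disclosure. Two things are assumed rather than stated by the source: that the last header field is the reading frame, and that the coordinate range includes the stop codon, so that the protein length is the number of codons minus one.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Hit(NamedTuple):
    gi: str
    description: str
    score: int
    evalue: float


# Assumed layout: "gi|<digits>", free text, integer score, E-value at end of line.
HIT_RE = re.compile(
    r"^gi\|(\d+)\|?(.*?)\s+(\d+)\s+(\d*\.?\d+(?:e[-+]?\d+)?)\s*$",
    re.IGNORECASE,
)
# Assumed layout of the header tail: "ORF|<start>-<end>|<frame>".
QUERY_RE = re.compile(r"ORF\|(\d+)-(\d+)\|(\d+)\s*$")


def parse_hit(line: str) -> Optional[Hit]:
    """Return the parsed hit, or None if the line carries no score and E-value."""
    m = HIT_RE.match(line.strip())
    if m is None:
        return None
    gi, desc, score, evalue = m.groups()
    return Hit(gi, desc.strip(), int(score), float(evalue))


def parse_query(line: str) -> Optional[Tuple[int, int, int]]:
    """Return (start, end, frame) from a normalised "Query=" header."""
    m = QUERY_RE.search(line.strip())
    if m is None:
        return None
    start, end, frame = (int(x) for x in m.groups())
    return start, end, frame


def expected_letters(start: int, end: int) -> int:
    """Protein length implied by the span, assuming the stop codon is included."""
    return (end - start + 1) // 3 - 1


header = "Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1"
hit_line = ("gi|1934763|emb|CAB07983| (Z93946) hypothetical protein "
            "[bacterio... 147 1e-35")
start, end, frame = parse_query(header)
print(expected_letters(start, end), parse_hit(hit_line).score)
```

For the dp1ORF080 block, the span 42490-42759 covers 270 nucleotides, or 90 codons, which agrees with the stated 89 letters under the stop-codon assumption.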

Query= sid|114912|lan|dp1ORF091 Phage dp1 ORF|43189-43413|1
(74 letters) 



64 5e-10



108 2e-23 

108 3e-23 

56 2e-07 

52 3e-06 

50 1e-05



287 2e-77 



gi|1934765|emb|CAB07985| (Z93946) holin [bacteriophage Dp-1]



63 2e-10 






Table 32 



Sequence of Dp1 published by Sheehan et al. (4731 nucleotides).



1 tttaaatttc ttgacaaagt caattcaaat tgtaccgccg aagcaatttt ccatgtattc actcaaagtt

71 gttcagtgtg gctcaatcat attaaaatcg aacttggtaa tatctctact ccttttagtg aagcagagga 

141 agaccttaaa tatcgaattg actcaaaagc cgatcaaaag ctaactaacc aacagttgac ggcactcacg 

211 gaaaaggctc aactacatga cgcagaactg aaagctaagg ctacaatgga gcagttaagt aacttagaaa 

281 aggcttatga aggtagaatg aaagctaatg aagaagctat caacaaatcg gaacccgacc taatcttagc 

351 ggcaagtcga attgaagcta ctatccaaga acttggcggg ctacgggaac tgaagaagtt cgtcgacagc

421 tgcatgagct cttctaatca aggtctaatt atcggtaaga acgacggtag ctctaccatt aaggtatcaa 

491 gtgaccgaat ttctatgttc tccgcaggga atgaagttat gtaccttacg caagggttca ttcacatcga 

561 taacgggatc tttacccaat ccattcaagt cggccgattt agaacggaac aatactcgtt taatccagac 

631 atgaacgtga ttcggtatgt aggataagga gaataacatg acaaaattta tcaactcata cggccctctt 

701 cacttgaacc tttacgtcga acaagttagt caggacgtaa cgaacaactc ctcgcgagtt agttggcgag 

771 ctactgtcga ccgcgatgga gcttatcgaa cgtggactta tggaaatatt agtaaccttt ccgtatggtt 

841 aaatggttca agtgttcata gcagtcaccc agactacgac acgtccggcg aagaggtaac gctcgcaagt 

911 ggagaagtga ctgttcctca caatagtgac gggacaaaga caatgtccgt ttgggcttcg tttgacccta 

981 ataacggcgt tcacggaaat atcactatct ctactaatta cactttagac agtattccaa ggtctacaca 

1051 gatttctagt tttgagggaa atcgaaatct aggatcttta catacggtta tctttaaccg aaaagtgaac 

1121 tcttttacgc atcaagtttg gtaccgagtt ttcggtagcg actggataga tttaggtaag aaccatacta 

1191 ctagcgtatc ctttacgccg tcactggact tagcaaggta cttacctaaa tcaagttccg gaacaatgga 

1261 catctgtatt cgaacctata acggaactac gcaaattggt agtgacgtct attcaaacgg atggaggttc 

1331 aacatccccg attcagtacg tcctactttt tcgggcattt ctttagtaga cacgacttca gcggttcgac 

1401 agattttaac agggaacaac ttcctccaaa tcatgtcgaa cattcaagtc aacttcaaca atgcttccgg 

1471 cgcttacgga tccactatcc aagcatttca cgctgagctc gtaggtaaaa accaagctat caacgaaaac 

1541 ggcggcaaat tgggtatgat gaactttaat ggctccgcta ccgtaagagc atgggttaca gacacgcgag 

1611 gaaaacaatc gaacgtccaa gacgtatcta tcaatgttat agaatactat ggaccgtcta tcaatttctc 

1681 cgttcaacgt actcgtcaaa atcctgcaat tatccaagct cttcgaaatg ctaaggtcgc acctataacg 

1751 gtaggaggtc aacagaaaaa catcatgcaa attaccttct ccgtggcgcc gttgaacact actaatttca 

1821 cagaagatag aggttcggcg tcagggacgt tcactactat ttccctactg actaactcgt ccgcgaactt 

1891 agctggtaac tacgggccgg acaagtctta catagttaag gctaaaatcc aagacaggtt cacttcgact 

1961 gaatttagtg ctacggtacc taccgaatca gtagttctta actatgacaa ggacggtcga cttggagttg 

2031 gtaaggttgt agaacaaggg aaggcagggt caattgatgc agcaggtgat atatatgctg gaggtcgaca 

2101 agttcaacag tttcagctca ctgataataa tggagcattg aacaggggtc aatataacga tgttggaata 

2171 agcgtgaaac agagtttaca tggcgaagta acaaatacga ggacaaccct acgggaactc gaggtgaatg 

2241 gggactattt caaaatttct ggttagatag ctggaaaatg gttcaatcct tcattacaat gtcaggaaga 

2311 atgttcatca ggacagcgaa cgatggaaac agctggagac ctaacaagtg gaaagaggtt ctatttaagc 

2381 aagacttcga acagaataat tggcagaaac ttgttcttca aagtgggtgg aaccatcact caacctatgg

2451 cgacgcattc tattcgaaaa ctcttgacgg catagtatat ttgagaggaa atgtgcataa aggacttatc

2521 gacaaagagg ctactattgc agtacttcct gaaggattta gaccgaaagt ttcaatgtat cttcaggctc 

2591 tcaataactc atatggaaat gecattctat gtatatacac tgacggaaga cttgtggtga aatcgaatgt 

2661 agataattct tggttaaatt tagacaatgt ctcatttcgt atttaatttg agctgaaatc atgttataat 

2731 attttttaga aaggaggtga gaactatgtt gaaccttaca aaatcgcgcc aaattgtggc agagttcact 

2801 attggacaag gagctgaaaa gaaacttgtc aaaacaacga ttgcgaacat tgatgcaaac gcagtatcaa 

2871 ccgtctctga aactcttcat gacccagact tgtatgctgc gaaccgtcga gaacttcgag ctgacgagca 

2941 aaaacttcgc gaaactcgtt acgcaatcga agatgaaatt aatagctgga gcgggggaaa aaagggggag 

3011 cccggctcta atfaggctgaa taaggaggcg tcaatctatg ccaatgtggc taaacgacac cgcagtcttg 

3081 acgacgatta ttacagcgtg cagcggagtg cttactgtcc tactaaataa gttattcgaa tggaaatcga 

3151 ataaagccaa gagcgtttta gaggatatct ctacaactct tagcactctt aaacagcagg tcgacgggat 

3221 tgaccaaacg acagtagcaa tcaatcacca aaatgacgtc attcaagacg gaactagaaa aattcaacgt 

3291 taccgtcttt atcacgactt aaaaagggaa gtgataacag gctatacaac tctcgaccat tttagagagc 

3361 tctctatttt attcgaaagt tataagaacc ttggcggaaa tggtgaagtt gaagccttgt atgaaaaata 

3431 caagaaatta ccaattaggg aggaagattt agatgaaact atctaacgaa caatatgacg tagcaaagaa

3501 cgtggtaacc gtagtcgttc cagcagcgat tgcactaatt acaggtcttg gagcgttgta tcaatttgac

3571 actactgcta tcacaggaac cattgcactt cttgcaactt ttgcaggtac tgttctagga gtttctagcc 

3641 gaaactacca aaaggaacaa gaagctcaaa acaatgaggt ggaataatgg gagtcgatat tgaaaaaggc 

3711 gttgcgtgga tgcaggcccg aaagggtcga gtatcttata gcatggactt tcgagacggt cctgatagct 

3781 atgactgctc aagttctatg tactatgctc tccgctcagc cggagcttca agtgctggat gggcagtcaa 

3851 tactgagtac atgcacgcat ggcttattga aaacggttat gaactaatta gtgaaaatgc tccgtgggat 

3921 gctaaacgag gcgacatctt catctgggga cgcaaaggtg ctagcgcagg cgctggaggt catacaggga 

3991 tgttcattga cagtgataac atcattcact gcaactacgc ctacgacgga atttccgtca acgaccacga 

4061 cgagcgttgg tactatgcag gtcaacctta ctactacgtc tatcgcttga ctaacgcaaa tgctcaaccg 

4131 gctgagaaga aacctggctg gcagaaagat gctactggtt tctggtacgc tcgagcaaac ggaacttatc 

4201 caaaagatga gttcgagtat atcgaagaaa acaagtcttg gttctacttt gacgaccaag gctacatgct 

4271 cgctgagaaa tggttgaaac atactgatgg aaattggtat tggttcgacc gtgacggata catggctacg 

4341 tcatggaaac ggattggcga gtcatggtac tacttcaatc gcgatggttc aatggtaacc ggttggatta 

4411 agtattacga taattggtat tattgtgatg ctaccaacgg cgacatgaaa tcgaatgcgt ttatccgtta 

4481 taacgacggc tggtatctac tattaccgga cggacgtctg gcagataaac ctcaattcac cgtagagccg 

4551 gacgggctca ttactgctaa agtttaaaat atagagagga ggaagctctt ttcttaatac tgtttctctt 

4621 aatcccgcaa ggtttcgacc ctgcggggtt tatgtgtcgt gaattactct atttacttat tcgaagactt 

4691 caattataat taaataatca acgagattca taattggagg aatg 
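
The listing in Table 32 uses a position-numbered layout: each line starts with the 1-based position of its first base, followed by groups of ten bases. As an illustration only, the following Python sketch shows one way such a listing could be read back into a single string while checking that every line starts where the previous one ended and contains only a, c, g and t. The function name `parse_numbered_sequence` is hypothetical, and the sample is limited to the first two lines of the table rather than the full sequence.

```python
from typing import List

VALID_BASES = set("acgt")


def parse_numbered_sequence(text: str) -> str:
    """Join position-numbered sequence lines, validating offsets and alphabet."""
    pieces: List[str] = []
    length = 0
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # blank separator lines between rows
        if not fields[0].isdigit():
            raise ValueError(f"line does not start with a position: {raw!r}")
        start = int(fields[0])
        chunk = "".join(fields[1:]).lower()
        if start != length + 1:
            raise ValueError(f"expected position {length + 1}, found {start}")
        bad = set(chunk) - VALID_BASES
        if bad:
            raise ValueError(f"unexpected characters {sorted(bad)} at position {start}")
        pieces.append(chunk)
        length += len(chunk)
    return "".join(pieces)


# First two rows of the table, used here only as a small sample.
SAMPLE = """\
1 tttaaatttc ttgacaaagt caattcaaat tgtaccgccg aagcaatttt ccatgtattc actcaaagtt

71 gttcagtgtg gctcaatcat attaaaatcg aacttggtaa tatctctact ccttttagtg aagcagagga
"""

sequence = parse_numbered_sequence(SAMPLE)
print(len(sequence))
```

A check of this kind flags rows whose position label or base count was damaged in reproduction, since the running length then disagrees with the label on the next row.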
Table 33



Streptococcus accession numbers 

gi|5776553|gb|AF026471.2|AF026471 [5776553]

gi|5410470|gb|AF139890.1|AF139890 [5410470]

gi|5410468|gb|AF139889.1|AF139889 [5410468]

gi|5410466|gb|AF139888.1|AF139888 [5410466]

gi|5410464|gb|AF139887.1|AF139887 [5410464]

gi|5410462|gb|AF139886.1|AF139886 [5410462]

gi|5410460|gb|AF139885.1|AF139885 [5410460]

gi|5410458|gb|AF139884.1|AF139884 [5410458]

gi|5410456|gb|AF139883.1|AF139883 [5410456]

gi|3093394|emb|AJ005697.1|SPN5697 [3093394]

gi|5759208|gb|AF171873.1|AF171873 [5759208]

gi|5758311|gb|AF162664.1|AF162664 [5758311]

gi|5739313|gb|AF161701.1|AF161701 [5739313]

gi|5739310|gb|AF161700.1|AF161700 [5739310]

gi|5726354|gb|AF159448.1|AF159448 [5726354]

gi|5726290|gb|AF127143.1|AF127143 [5726290]

gi|5712666|gb|AF140784.1|AF140784 [5712666]

gi|4218525|emb|AJ009639.1|SPAJ9639 [4218525]

gi|5616524|gb|AF169483.1|AF169483 [5616524]

gi|5579395|gb|AF162656.1|AF162656 [5579395]

gi|5579393|gb|AF162655.1|AF162655 [5579393]

gi|5578890|emb|AJ131985.1|SPN131985 [5578890]

gi|5566442|gb|AF167442.1|AF167442 [5566442]

gi|5459332|emb|AJ243540.1|EVE243540 [5459332]

gi|5305398|gb|AF072811.1|AF072811 [5305398]

gi|5295921|emb|AJ242698.1|SPN242698 [5295921]

gi|5295920|emb|AJ242697.1|SPN242697 [5295920]

gi|5295919|emb|AJ242696.1|SPN242696 [5295919]

gi|5295918|emb|AJ242695.1|SPN242695 [5295918]

gi|4583522|gb|AF140356.1|AF140356 [4583522]
gi|5231206|gb|AF157826.1|AF157826 [5231206]
gi|5231203|gb|AF157825.1|AF157825 [5231203]



gi|5231200|gb|AF157824.1|AF157824 [5231200]

gi|5231197|gb|AF157823.1|AF157823 [5231197]

gi|5231194|gb|AF157822.1|AF157822 [5231194]

gi|5231191|gb|AF157821.1|AF157821 [5231191]

gi|5231188|gb|AF157820.1|AF157820 [5231188]

gi|5231185|gb|AF157819.1|AF157819 [5231185]

gi|5231182|gb|AF157818.1|AF157818 [5231182]

gi|5231179|gb|AF157817.1|AF157817 [5231179]

gi|4336851|gb|AF106138.1|AF106138 [4336851]

gi|4336848|gb|AF106137.1|AF106137 [4336848]

gi|4336845|gb|AF106136.1|AF106136 [4336845]

gi|4336842|gb|AF106135.1|AF106135 [4336842]

gi|4336839|gb|AF106134.1|AF106134 [4336839]

gi|4336836|gb|AF106133.1|AF106133 [4336836]

gi|4336833|gb|AF106132.1|AF106132 [4336833]

gi|3907597|gb|AF094575.1|AF094575 [3907597]

gi|5030425|gb|AF061748.2|AF061748 [5030425]

gi|4902881|emb|AJ239004.1|SPN239004 [4902881]

gi|5001710|gb|AF112358.1|AF112358 [5001710]

gi|5001690|gb|AF106539.1|AF106539 [5001690]

gi|4973271|gb|AF144420.1|AF144420 [4973271]

gi|4973269|gb|AF144419.1|AF144419 [4973269]

gi|4973267|gb|AF144418.1|AF144418 [4973267]

gi|4928190|gb|AF129757.1|AF129757 [4928190]

gi|4927743|gb|AF126061.1|AF126061 [4927743]

gi|4927742|gb|AF126060.1|AF126060 [4927742]

gi|4927741|gb|AF126059.1|AF126059 [4927741]

gi|4495247|emb|AJ240675.1|SPN240675 [4495247]

gi|4495245|emb|AJ240670.1|SPN240670 [4495245]

gi|4495243|emb|AJ240669.1|SPN240669 [4495243]

gi|4495241|emb|AJ240668.1|SPN240668 [4495241]

gi|4495239|emb|AJ240667.1|SPN240667 [4495239]



gi|4495237|emb|AJ240666.1|SPN240666 [4495237]

gi|4495235|emb|AJ240665.1|SPN240665 [4495235]

gi|4495233|emb|AJ240664.1|SPN240664 [4495233]

gi|4495231|emb|AJ240663.1|SPN240663 [4495231]

gi|4495229|emb|AJ240662.1|SPN240662 [4495229]

gi|4495227|emb|AJ240661.1|SPN240661 [4495227]

gi|4495225|emb|AJ240660.1|SPN240660 [4495225]

gi|4495223|emb|AJ240659.1|SPN240659 [4495223]

gi|4495221|emb|AJ240658.1|SPN240658 [4495221]

gi|4495219|emb|AJ240657.1|SPN240657 [4495219]

gi|4495217|emb|AJ240656.1|SPN240656 [4495217]

gi|4495215|emb|AJ240655.1|SPN240655 [4495215]

gi|4495213|emb|AJ240654.1|SPN240654 [4495213]

gi|4495211|emb|AJ240653.1|SPN240653 [4495211]

gi|4495209|emb|AJ240652.1|SPN240652 [4495209]

gi|4495207|emb|AJ240651.1|SPN240651 [4495207]

gi|4495205|emb|AJ240650.1|SPN240650 [4495205]

gi|4495203|emb|AJ240649.1|SPN240649 [4495203]

gi|4495201|emb|AJ240648.1|SPN240648 [4495201]

gi|4495199|emb|AJ240647.1|SPN240647 [4495199]

gi|4495197|emb|AJ240644.1|SPN240644 [4495197]

gi|4495195|emb|AJ240643.1|SPN240643 [4495195]

gi|4495193|emb|AJ240642.1|SPN240642 [4495193]

gi|4495191|emb|AJ240641.1|SPN240641 [4495191]
gi|4495189|emb|AJ240640.1|SPN240640 [4495189]

gi|4495187|emb|AJ240639.1|SPN240639 [4495187]

gi|4495185|emb|AJ240638.1|SPN240638 [4495185]

gi|4495183|emb|AJ240637.1|SPN240637 [4495183]

gi|4495181|emb|AJ240636.1|SPN240636 [4495181]

gi|4495179|emb|AJ240635.1|SPN240635 [4495179]

gi|4495177|emb|AJ240634.1|SPN240634 [4495177]

gi|4495175|emb|AJ240633.1|SPN240633 [4495175]

gi|4495173|emb|AJ240630.1|SPN240630 [4495173]

gi|4495171|emb|AJ240629.1|SPN240629 [4495171]

gi|4495169|emb|AJ240628.1|SPN240628 [4495169]

gi|4495167|emb|AJ240627.1|SPN240627 [4495167]

gi|4495165|emb|AJ240626.1|SPN240626 [4495165]

gi|4495163|emb|AJ240625.1|SPN240625 [4495163]

gi|4495161|emb|AJ240624.1|SPN240624 [4495161]

gi|4495159|emb|AJ240623.1|SPN240623 [4495159]

gi|4495157|emb|AJ240622.1|SPN240622 [4495157]

gi|4495155|emb|AJ240621.1|SPN240621 [4495155]

gi|4495153|emb|AJ240620.1|SPN240620 [4495153]

gi|4495151|emb|AJ240619.1|SPN240619 [4495151]

gi|4495149|emb|AJ240616.1|SPN240616 [4495149]

gi|4495147|emb|AJ240615.1|SPN240615 [4495147]

gi|4495145|emb|AJ240614.1|SPN240614 [4495145]

gi|4495143|emb|AJ240613.1|SPN240613 [4495143]



gi|4495141|emb|AJ240612.1|SPN240612 [4495141]

gi|4495139|emb|AJ240611.1|SPN240611 [4495139]

gi|4495137|emb|AJ240610.1|SPN240610 [4495137]

gi|4495135|emb|AJ240609.1|SPN240609 [4495135]

gi|4495133|emb|AJ240608.1|SPN240608 [4495133]

gi|4495131|emb|AJ240607.1|SPN240607 [4495131]

gi|4495129|emb|AJ240606.1|SPN240606 [4495129]

gi|4883698|gb|AF079807.1|AF079807 [4883698]

gi|4838562|gb|AF145055.1|AF145055 [4838562]

gi|4063727|gb|L29324.1|STRINTE [4063727]

gi|3093401|emb|AJ005619.1|SPAJ5619 [3093401]

gi|4103889|gb|AF029368.1|AF029368 [4103889]

gi|2897689|dbj|D63805.1|D63805 [2897689]

gi|4566771|gb|AF117741.1|AF117741 [4566771]

gi|4566768|gb|AF117740.1|AF117740 [4566768]

gi|4538836|emb|AJ240793.1|SPN240793 [4538836]

gi|4538832|emb|AJ240792.1|SPN240792 [4538832]

gi|4538828|emb|AJ240791.1|SPN240791 [4538828]

gi|4538824|emb|AJ240790.1|SPN240790 [4538824]

gi|4538821|emb|AJ240789.1|SPN240789 [4538821]

gi|4538818|emb|AJ240788.1|SPN240788 [4538818]

gi|4538815|emb|AJ240787.1|SPN240787 [4538815]

gi|4538812|emb|AJ240786.1|SPN240786 [4538812]

gi|4538809|emb|AJ240785.1|SPN240785 [4538809]

gi|4538806|emb|AJ240784.1|SPN240784 [4538806]

gi|4538803|emb|AJ240783.1|SPN240783 [4538803]

gi|4538800|emb|AJ240782.1|SPN240782 [4538800]



gi|4538797|emb|AJ240781.1|SPN240781 [4538797]

gi|4538794|emb|AJ240780.1|SPN240780 [4538794]

gi|4538791|emb|AJ240779.1|SPN240779 [4538791]

gi|4538788|emb|AJ240778.1|SPN240778 [4538788]

gi|4538785|emb|AJ240777.1|SPN240777 [4538785]

gi|4538782|emb|AJ240776.1|SPN240776 [4538782]

gi|4538779|emb|AJ240775.1|SPN240775 [4538779]

gi|4538776|emb|AJ240774.1|SPN240774 [4538776]

gi|4538773|emb|AJ240773.1|SPN240773 [4538773]

gi|4538770|emb|AJ240772.1|SPN240772 [4538770]

gi|4538767|emb|AJ240771.1|SPN240771 [4538767]

gi|4538764|emb|AJ240770.1|SPN240770 [4538764]

gi|4538761|emb|AJ240769.1|SPN240769 [4538761]

gi|4538758|emb|AJ240768.1|SPN240768 [4538758]

gi|4538755|emb|AJ240767.1|SPN240767 [4538755]

gi|4538752|emb|AJ240766.1|SPN240766 [4538752]

gi|4538749|emb|AJ240765.1|SPN240765 [4538749]

gi|4538746|emb|AJ240761.1|SPN240761 [4538746]

gi|4538743|emb|AJ240760.1|SPN240760 [4538743]

gi|4538740|emb|AJ240759.1|SPN240759 [4538740]

gi|4538737|emb|AJ240758.1|SPN240758 [4538737]

gi|4538734|emb|AJ240757.1|SPN240757 [4538734]

gi|4538731|emb|AJ240756.1|SPN240756 [4538731]

gi|4538728|emb|AJ240755.1|SPN240755 [4538728]



gi|4538725|emb|AJ240754.1|SPN240754 [4538725]

gi|4538722|emb|AJ240753.1|SPN240753 [4538722]

gi|4538719|emb|AJ240752.1|SPN240752 [4538719]

gi|4538716|emb|AJ240751.1|SPN240751 [4538716]

gi|4538713|emb|AJ240750.1|SPN240750 [4538713]

gi|4538710|emb|AJ240749.1|SPN240749 [4538710]

gi|4538707|emb|AJ240748.1|SPN240748 [4538707]

gi|4538704|emb|AJ240747.1|SPN240747 [4538704]

gi|4538701|emb|AJ240746.1|SPN240746 [4538701]

gi|4538698|emb|AJ240745.1|SPN240745 [4538698]

gi|4538695|emb|AJ240744.1|SPN240744 [4538695]

gi|4538692|emb|AJ240743.1|SPN240743 [4538692]

gi|4538689|emb|AJ240742.1|SPN240742 [4538689]

gi|4538686|emb|AJ240741.1|SPN240741 [4538686]

gi|4538683|emb|AJ240740.1|SPN240740 [4538683]

gi|4538680|emb|AJ240739.1|SPN240739 [4538680]

gi|4538677|emb|AJ240738.1|SPN240738 [4538677]

gi|4530444|gb|AF118229.1|AF118229 [4530444]
gi|4519253|dbj|AB015852.1|AB015852 [4519253]
gi|4519251|dbj|AB015851.1|AB015851 [4519251]
gi|4519249|dbj|AB015850.1|AB015850 [4519249]
gi|4519247|dbj|AB015849.1|AB015849 [4519247]
gi|4519245|dbj|AB015848.1|AB015848 [4519245]
gi|4519243|dbj|AB015847.1|AB015847 [4519243]
gi|4519241|dbj|AB015846.1|AB015846 [4519241]
gi|4519239|dbj|AB011210.1|AB011210 [4519239]
gi|4519237|dbj|AB011209.1|AB011209 [4519237]
gi|4519235|dbj|AB011208.1|AB011208 [4519235]



gi|4519233|dbj|AB011207.1|AB011207 [4519233]

gi|4519231|dbj|AB011206.1|AB011206 [4519231]

gi|4519229|dbj|AB011205.1|AB011205 [4519229]

gi|4519227|dbj|AB011204.1|AB011204 [4519227]

gi|4519225|dbj|AB011203.1|AB011203 [4519225]

gi|4519223|dbj|AB011202.1|AB011202 [4519223]

gi|4519221|dbj|AB011201.1|AB011201 [4519221]

gi|4519219|dbj|AB011200.1|AB011200 [4519219]

gi|4519217|dbj|AB011199.1|AB011199 [4519217]

gi|4519215|dbj|AB011198.1|AB011198 [4519215]

gi|4495127|emb|AJ240605.1|SPN240605 [4495127]

gi|4468031|emb|AJ132957.1|SPN132957 [4468031]

gi|4468029|emb|AJ132956.1|SPN132956 [4468029]

gi|4218532|emb|AJ010312.1|SPN010312 [4218532]

gi|4456852|emb|AJ236792.1|SPN236792 [4456852]

gi|4456850|emb|AJ236791.1|SPN236791 [4456850]

gi|4456848|emb|AJ236790.1|SPN236790 [4456848]

gi|4456846|emb|AJ236789.1|SPN236789 [4456846]

gi|3550644|emb|AJ006987.1|SPAJ6987 [3550644]

gi|3550625|emb|AJ006986.1|SPAJ6986 [3550625]

gi|4416518|gb|AF014458.2|AF014458 [4416518]

gi|4406260|gb|AF105116.1|AF105116 [4406260]

gi|4406257|gb|AF105115.1|AF105115 [4406257]

gi|4406254|gb|AF105114.1|AF105114 [4406254]

gi|4406246|gb|AF105113.1|AF105113 [4406246]

gi|4406243|gb|AF105112.1|AF105112 [4406243]

gi|4138533|emb|AJ005815.1|SPN5815 [4138533]

gi|3821726|emb|AJ232433.1|SPN232433 [3821726]

gi|3821724|emb|AJ232432.1|SPN232432 [3821724]

gi|3821722|emb|AJ232431.1|SPN232431 [3821722]

gi|3821720|emb|AJ232430.1|SPN232430 [3821720]
gi|3821718|emb|AJ232429.1|SPN232429 [3821718]

gi|3821670|emb|AJ232405.1|SPN232405 [3821670]

gi|3821716|emb|AJ232428.1|SPN232428 [3821716]

gi|3821668|emb|AJ232404.1|SPN232404 [3821668]

gi|3821714|emb|AJ232427.1|SPN232427 [3821714]

gi|3821666|emb|AJ232403.1|SPN232403 [3821666]

gi|3821712|emb|AJ232426.1|SPN232426 [3821712]

gi|3821664|emb|AJ232402.1|SPN232402 [3821664]

gi|3821710|emb|AJ232425.1|SPN232425 [3821710]

gi|3821662|emb|AJ232401.1|SPN232401 [3821662]

gi|3821708|emb|AJ232424.1|SPN232424 [3821708]

gi|3821660|emb|AJ232399.1|SPN232399 [3821660]

gi|3821706|emb|AJ232423.1|SPN232423 [3821706]

gi|3821658|emb|AJ232398.1|SPN232398 [3821658]

gi|3821704|emb|AJ232422.1|SPN232422 [3821704]

gi|3821656|emb|AJ232397.1|SPN232397 [3821656]

gi|3821702|emb|AJ232421.1|SPN232421 [3821702]

gi|3821654|emb|AJ232396.1|SPN232396 [3821654]

gi|3821700|emb|AJ232420.1|SPN232420 [3821700]

gi|3821652|emb|AJ232395.1|SPN232395 [3821652]

gi|3821698|emb|AJ232419.1|SPN232419 [3821698]

gi|3821650|emb|AJ232394.1|SPN232394 [3821650]

gi|3821696|emb|AJ232418.1|SPN232418 [3821696]

gi|3821648|emb|AJ232393.1|SPN232393 [3821648]

gi|3821694|emb|AJ232417.1|SPN232417 [3821694]

gi|3821646|emb|AJ232392.1|SPN232392 [3821646]

gi|3821692|emb|AJ232416.1|SPN232416 [3821692]

gi|3821644|emb|AJ232391.1|SPN232391 [3821644]

gi|3821690|emb|AJ232415.1|SPN232415 [3821690]

gi|3821642|emb|AJ232390.1|SPN232390 [3821642]

gi|3821688|emb|AJ232414.1|SPN232414 [3821688]

gi|3821640|emb|AJ232389.1|SPN232389 [3821640]

gi|3821686|emb|AJ232413.1|SPN232413 [3821686]

gi|3821638|emb|AJ232388.1|SPN232388 [3821638]

gi|3821684|emb|AJ232412.1|SPN232412 [3821684]

gi|3821636|emb|AJ232387.1|SPN232387 [3821636]

gi|3821682|emb|AJ232411.1|SPN232411 [3821682]

gi|3821634|emb|AJ232386.1|SPN232386 [3821634]

gi|3821680|emb|AJ232410.1|SPN232410 [3821680]

gi|3821632|emb|AJ232385.1|SPN232385 [3821632]


gi|382 1678|emb|AJ232409. 1 |SPN232409 


gi|382 1 630|emb|AJ232384. 1 |SPN232384 


(3o2lo78j 


[3821630] 


gi|3821676|emb|AJ232408.1|SPN232408 


gi|3821628|emb|AJ232383.1|SPN232383 


[3821676J 


[3821628] 


gi|382 1 674|emb|AJ232407. 1 |SPN232407 


gi|382 1626|emb|AJ232382. 1 |SPN232382 


[3821674] 


[3821626] 


gi|3821672|emb|AJ232406.1|SPN232406 


gi|382 1 624|emb|AJ23238 1 . 1 |SPN23238 1 


[3821672] 


[3821624] 



gi|3821622|emb|AJ232380.1|SPN232380
[3821622]

gi|3821620|emb|AJ232379.1|SPN232379
[3821620]

gi|3821618|emb|AJ232378.1|SPN232378
[3821618]

gi|3821616|emb|AJ232377.1|SPN232377
[3821616]

gi|3821614|emb|AJ232376.1|SPN232376
[3821614]

gi|3821612|emb|AJ232375.1|SPN232375
[3821612]

gi|3821610|emb|AJ232373.1|SPN232373
[3821610]

gi|3821608|emb|AJ232372.1|SPN232372
[3821608]

gi|3821606|emb|AJ232371.1|SPN232371
[3821606]

gi|3821604|emb|AJ232370.1|SPN232370
[3821604]

gi|3821602|emb|AJ232369.1|SPN232369
[3821602]

gi|3821600|emb|AJ232368.1|SPN232368
[3821600]

gi|3821598|emb|AJ232367.1|SPN232367
[3821598]

gi|3821596|emb|AJ232366.1|SPN232366
[3821596]

gi|3821594|emb|AJ232365.1|SPN232365
[3821594]

gi|3820454|emb|AJ007367.1|SPN7367 [3820454]

gi|3821592|emb|AJ232364.1|SPN232364
[3821592]

gi|3821590|emb|AJ232363.1|SPN232363
[3821590]

gi|3821588|emb|AJ232362.1|SPN232362
[3821588]

gi|3821586|emb|AJ232361.1|SPN232361
[3821586]

gi|3821584|emb|AJ232360.1|SPN232360
[3821584]

gi|3821582|emb|AJ232359.1|SPN232359
[3821582]

gi|3821580|emb|AJ232358.1|SPN232358
[3821580]

gi|3821578|emb|AJ232357.1|SPN232357
[3821578]

gi|3821576|emb|AJ232356.1|SPN232356
[3821576]

gi|3821574|emb|AJ232355.1|SPN232355
[3821574]

gi|3821572|emb|AJ232353.1|SPN232353
[3821572]

gi|3821570|emb|AJ232352.1|SPN232352
[3821570]

gi|3821568|emb|AJ232351.1|SPN232351
[3821568]

gi|3821566|emb|AJ232350.1|SPN232350
[3821566]

gi|3821564|emb|AJ232349.1|SPN232349
[3821564]

gi|3821562|emb|AJ232348.1|SPN232348
[3821562]

gi|3821560|emb|AJ232347.1|SPN232347
[3821560]

gi|3821558|emb|AJ232346.1|SPN232346
[3821558]

gi|3821556|emb|AJ232345.1|SPN232345
[3821556]

gi|3821554|emb|AJ232344.1|SPN232344
[3821554]

gi|3821552|emb|AJ232343.1|SPN232343
[3821552]

gi|3821550|emb|AJ232342.1|SPN232342
[3821550]

gi|3821548|emb|AJ232341.1|SPN232341
[3821548]

gi|3821546|emb|AJ232340.1|SPN232340
[3821546]

gi|3821544|emb|AJ232339.1|SPN232339
[3821544]

gi|3821542|emb|AJ232338.1|SPN232338
[3821542]

gi|3821540|emb|AJ232337.1|SPN232337
[3821540]

gi|3821538|emb|AJ232336.1|SPN232336
[3821538]

gi|3821536|emb|AJ232335.1|SPN232335
[3821536]

gi|3821534|emb|AJ232334.1|SPN232334
[3821534]

gi|3821532|emb|AJ232333.1|SPN232333
[3821532]

gi|3821530|emb|AJ232332.1|SPN232332
[3821530]
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gi|3821 528|emb|AJ23233 1 . l|SPN23233 1 


[3821528]

gi|3821526|emb|AJ232330.1|SPN232330
[3821526]

gi|3821524|emb|AJ232329.1|SPN232329
[3821524]

gi|3821522|emb|AJ232328.1|SPN232328
[3821522]

gi|3821520|emb|AJ232327.1|SPN232327
[3821520]

gi|3821518|emb|AJ232326.1|SPN232326
[3821518]

gi|3821516|emb|AJ232325.1|SPN232325
[3821516]

gi|3821514|emb|AJ232324.1|SPN232324
[3821514]

gi|3821512|emb|AJ232322.1|SPN232322
[3821512]

gi|3821510|emb|AJ232321.1|SPN232321
[3821510]

gi|3821508|emb|AJ232320.1|SPN232320
[3821508]

gi|3821506|emb|AJ232319.1|SPN232319
[3821506]

gi|3821504|emb|AJ232318.1|SPN232318
[3821504]

gi|3821502|emb|AJ232317.1|SPN232317
[3821502]

gi|3821500|emb|AJ232316.1|SPN232316
[3821500]

gi|3821498|emb|AJ232315.1|SPN232315
[3821498]

gi|3821496|emb|AJ232314.1|SPN232314
[3821496]

gi|3821494|emb|AJ232313.1|SPN232313
[3821494]

gi|3821492|emb|AJ232312.1|SPN232312
[3821492]

gi|3821490|emb|AJ232311.1|SPN232311
[3821490]

gi|3821488|emb|AJ232310.1|SPN232310
[3821488]

gi|3821486|emb|AJ232309.1|SPN232309
[3821486]

gi|3821484|emb|AJ232308.1|SPN232308
[3821484]

gi|3821482|emb|AJ232307.1|SPN232307
[3821482]

gi|3821480|emb|AJ232306.1|SPN232306
[3821480]

gi|3821478|emb|AJ232305.1|SPN232305
[3821478]

gi|3821476|emb|AJ232304.1|SPN232304
[3821476]

gi|3821474|emb|AJ232303.1|SPN232303
[3821474]

gi|3821472|emb|AJ232302.1|SPN232302
[3821472]

gi|3821470|emb|AJ232301.1|SPN232301
[3821470]

gi|3821468|emb|AJ232300.1|SPN232300
[3821468]

gi|3821466|emb|AJ232299.1|SPN232299
[3821466]

gi|3821464|emb|AJ232298.1|SPN232298
[3821464]

gi|3821462|emb|AJ232297.1|SPN232297
[3821462]

gi|3821460|emb|AJ232295.1|SPN232295
[3821460]

gi|3821458|emb|AJ232294.1|SPN232294
[3821458]

gi|3821456|emb|AJ232293.1|SPN232293
[3821456]

gi|3821454|emb|AJ232292.1|SPN232292
[3821454]

gi|3821452|emb|AJ232291.1|SPN232291
[3821452]

gi|3821450|emb|AJ232290.1|SPN232290
[3821450]

gi|3821448|emb|AJ232289.1|SPN232289
[3821448]

gi|3821446|emb|AJ232288.1|SPN232288
[3821446]

gi|3821444|emb|AJ232287.1|SPN232287
[3821444]

gi|3821442|emb|AJ232286.1|SPN232286
[3821442]

gi|3821440|emb|AJ232285.1|SPN232285
[3821440]

gi|3821438|emb|AJ232284.1|SPN232284
[3821438]

gi|3821436|emb|AJ232283.1|SPN232283
[3821436]

gi|3821434|emb|AJ232282.1|SPN232282
[3821434]
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gi|3821432|emb|AJ232281.1|SPN232281 
[3821432] 

gi|3821430|emb|AJ232280.1|SPN232280
[3821430]

gi|3821428|emb|AJ232279.1|SPN232279
[3821428]

gi|3821426|emb|AJ232278.1|SPN232278
[3821426]

gi|3821424|emb|AJ232276.1|SPN232276
[3821424]

gi|3821422|emb|AJ232275.1|SPN232275
[3821422]

gi|3821420|emb|AJ232274.1|SPN232274
[3821420]

gi|3821418|emb|AJ232273.1|SPN232273
[3821418]

gi|3821416|emb|AJ232272.1|SPN232272
[3821416]

gi|3821414|emb|AJ232271.1|SPN232271
[3821414]

gi|3821412|emb|AJ232270.1|SPN232270
[3821412]

gi|3821410|emb|AJ232269.1|SPN232269
[3821410]

gi|3821408|emb|AJ232268.1|SPN232268
[3821408]

gi|3821406|emb|AJ232267.1|SPN232267
[3821406]

gi|3821404|emb|AJ232266.1|SPN232266
[3821404]

gi|3821402|emb|AJ232265.1|SPN232265
[3821402]

gi|3821400|emb|AJ232264.1|SPN232264
[3821400]

gi|3821398|emb|AJ232263.1|SPN232263
[3821398]

gi|3821396|emb|AJ232262.1|SPN232262
[3821396]

gi|3821394|emb|AJ232261.1|SPN232261
[3821394]

gi|3821392|emb|AJ232260.1|SPN232260
[3821392]

gi|3821390|emb|AJ232259.1|SPN232259
[3821390]

gi|3821388|emb|AJ232258.1|SPN232258
[3821388]

gi|3821386|emb|AJ232257.1|SPN232257
[3821386]

gi|3821384|emb|AJ232256.1|SPN232256
[3821384]

gi|3821382|emb|AJ232255.1|SPN232255
[3821382]

gi|3821380|emb|AJ232254.1|SPN232254
[3821380]

gi|3821378|emb|AJ232253.1|SPN232253
[3821378]

gi|3821376|emb|AJ232252.1|SPN232252
[3821376]

gi|3821374|emb|AJ232251.1|SPN232251
[3821374]

gi|3821372|emb|AJ232250.1|SPN232250
[3821372]

gi|3821370|emb|AJ232249.1|SPN232249
[3821370]

gi|3821367|emb|AJ232248.1|SPN232248
[3821367]

gi|3821365|emb|AJ232247.1|SPN232247
[3821365]

gi|3821363|emb|AJ232246.1|SPN232246
[3821363]

gi|3821361|emb|AJ232245.1|SPN232245
[3821361]

gi|3821359|emb|AJ232244.1|SPN232244
[3821359]

gi|3821357|emb|AJ232243.1|SPN232243
[3821357]

gi|3821355|emb|AJ232241.1|SPN232241
[3821355]

gi|2921842|gb|AF047385.1|AF047385 [2921842]

gi|2909863|gb|AF047696.1|AF047696 [2909863]

gi|4193353|gb|AF055088.1|AF055088 [4193353]

gi|4185242|gb|AH007276.1|SEG_SPTNJUNC
[4185242]

gi|4185241|gb|AF066797.1|SPTNJUNC2
[4185241]

gi|4185240|gb|AF066796.1|SPTNJUNC1
[4185240]

gi|4097979|gb|U72655.1|SPU72655 [4097979]
gi|4063720|gb|L29323.1|STRMTR [4063720]
gi|1657605|gb|U66846.1|SPU66846 [1657605]
gi|1657602|gb|U66845.1|SPU66845 [1657602]
gi|4009485|gb|AF068903.1|AF068903 [4009485]
gi|4009477|gb|AF068902.1|AF068902 [4009477]
gi|4009462|gb|AF068901.1|AF068901 [4009462]
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gi|3947767|emb|AJ233896.1|SPN233896 
[3947767] 

gi|3947765|emb|AJ233895.1|SPN233895 
[3947765] 

gi|3947763|emb|AJ233894.1|SPN233894 
[3947763] 

gi|3947761|emb|AJ233893.1|SPN233893
[3947761]

gi|3947759|emb|AJ233892.1|SPN233892
[3947759]

gi|3947757|emb|AJ233891.1|SPN233891
[3947757]

gi|3947755|emb|AJ233890.1|SPN233890
[3947755]

gi|3947753|emb|AJ233889.1|SPN233889
[3947753]

gi|3947751|emb|AJ233888.1|SPN233888
[3947751]

gi|3947749|emb|AJ233887.1|SPN233887
[3947749]

gi|3947730|emb|AJ233886.1|SPN233886
[3947730]

gi|3758891|emb|Z71552.1|SPADCA [3758891]

gi|3818479|gb|AF057294.1|AF057294 [3818479]

gi|2351767|gb|U89711.1|SPU89711 [2351767]

gi|3395661|dbj|AB006879.1|AB006879 [3395661]

gi|3395659|dbj|AB006878.1|AB006878 [3395659]

gi|3395657|dbj|AB006877.1|AB006877 [3395657]

gi|3395655|dbj|AB006876.1|AB006876 [3395655]

gi|3395653|dbj|AB006875.1|AB006875 [3395653]

gi|3395651|dbj|AB006874.1|AB006874 [3395651]

gi|3395649|dbj|AB006873.1|AB006873 [3395649]

gi|3395647|dbj|AB006872.1|AB006872 [3395647]

gi|3395645|dbj|AB006871.1|AB006871 [3395645]

gi|3395643|dbj|AB006870.1|AB006870 [3395643]

gi|3395641|dbj|AB006869.1|AB006869 [3395641]

gi|3395639|dbj|AB006868.1|AB006868 [3395639]

gi|2315992|gb|U87092.1|SPU87092 [2315992]

gi|2209338|gb|U93576.1|SPU93576 [2209338]

gi|2109442|gb|AF000658.1|SPDNAARG
[2109442]

gi|1881538|gb|U09239.1|SPU09239 [1881538]
gi|1666904|gb|U76218.1|SPU76218 [1666904]
gi|1613766|gb|U33315.1|SPU33315 [1613766]

gi|1498294|gb|U41735.1|SPU41735 [1498294]
gi|1213493|gb|U47687.1|SPU47687 [1213493]
gi|1163109|gb|U43526.1|SPU43526 [1163109]
gi|556001|gb|U15171.1|SPU15171 [556001]
gi|455063|gb|U02920.1|SPU02920 [455063]
gi|784896|gb|L36923.1|STRSTRH [784896]
gi|3320386|gb|AF030373.1|AF030373 [3320386]
gi|2804772|gb|AF030374.1|AF030374 [2804772]
gi|2804762|gb|AF030372.1|AF030372 [2804762]
gi|2804756|gb|AF030371.1|AF030371 [2804756]
gi|2804750|gb|AF030370.1|AF030370 [2804750]
gi|2804745|gb|AF030369.1|AF030369 [2804745]
gi|2804739|gb|AF030368.1|AF030368 [2804739]
gi|2804732|gb|AF030367.1|AF030367 [2804732]
gi|2804726|gb|AF030366.1|AF030366 [2804726]
gi|2804720|gb|AF030365.1|AF030365 [2804720]
gi|2804713|gb|AF030364.1|AF030364 [2804713]
gi|2804707|gb|AF030363.1|AF030363 [2804707]
gi|2804701|gb|AF030362.1|AF030362 [2804701]
gi|2804694|gb|AF030361.1|AF030361 [2804694]
gi|2804688|gb|AF030360.1|AF030360 [2804688]
gi|2804682|gb|AF030359.1|AF030359 [2804682]
gi|3550979|dbj|AB010387.1|AB010387 [3550979]
gi|2275100|emb|AJ000336.1|SPR6LDH [2275100]
gi|3551853|gb|AF076029.1|AF076029 [3551853]
gi|3551773|gb|U94770.1|SPU94770 [3551773]
gi|3550617|emb|AJ004869.1|SPAJ4869 [3550617]
gi|3513563|gb|AF055727.1|AF055727 [3513563]
gi|3513561|gb|AF055726.1|AF055726 [3513561]
gi|3513559|gb|AF055725.1|AF055725 [3513559]
gi|3513557|gb|AF055724.1|AF055724 [3513557]
gi|3513555|gb|AF055723.1|AF055723 [3513555]
gi|3513553|gb|AF055722.1|AF055722 [3513553]
gi|3513549|gb|AF055721.1|AF055721 [3513549]
gi|3513545|gb|AF055720.1|AF055720 [3513545]
gi|1914869|emb|Z82001.1|SPZ82001 [1914869]
gi|2911421|gb|AF046238.1|AF046238 [2911421]
gi|2911419|gb|AF046237.1|AF046237 [2911419]
gi|2911417|gb|AF046236.1|AF046236 [2911417]
gi|2911415|gb|AF046235.1|AF046235 [2911415]
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gi|29U413|gb|AF046234.1|AF046234 [291 1413] 
gi|2911411|gb|AF046233.1|AF046233 [2911411]
gi|2911409|gb|AF046232.1|AF046232 [2911409]
gi|2911407|gb|AF046231.1|AF046231 [2911407]
gi|2911405|gb|AF046230.1|AF046230 [2911405]
gi|3258601|gb|U40786.1|SPU40786 [3258601]
gi|3211756|gb|AF052209.1|AF052209 [3211756]
gi|3211752|gb|AF052208.1|AF052208 [3211752]
gi|3211747|gb|AF052207.1|AF052207 [3211747]
gi|3220194|gb|AF053121.1|AF053121 [3220194]
gi|2766052|emb|Z99863.1|SPZ99863 [2766052]
gi|2766050|emb|Z99862.1|SPZ99862 [2766050]
gi|2766048|emb|Z99861.1|SPZ99861 [2766048]
gi|2766046|emb|Z99860.1|SPZ99860 [2766046]
gi|2766044|emb|Z99859.1|SPZ99859 [2766044]
gi|2766042|emb|Z99858.1|SPZ99858 [2766042]
gi|2766040|emb|Z99857.1|SPZ99857 [2766040]
gi|2766038|emb|Z99856.1|SPZ99856 [2766038]
gi|2766036|emb|Z99855.1|SPZ99855 [2766036]
gi|2766034|emb|Z99854.1|SPZ99854 [2766034]
gi|2766032|emb|Z99853.1|SPZ99853 [2766032]
gi|2766030|emb|Z99852.1|SPZ99852 [2766030]
gi|2766028|emb|Z99851.1|SPZ99851 [2766028]
gi|2766026|emb|Z99850.1|SPZ99850 [2766026]
gi|2766024|emb|Z99849.1|SPZ99849 [2766024]
gi|2766022|emb|Z99848.1|SPZ99848 [2766022]
gi|2766020|emb|Z99847.1|SPZ99847 [2766020]
gi|2766018|emb|Z99846.1|SPZ99846 [2766018]
gi|2766016|emb|Z99845.1|SPZ99845 [2766016]
gi|2766014|emb|Z99844.1|SPZ99844 [2766014]
gi|2766012|emb|Z99843.1|SPZ99843 [2766012]
gi|2766010|emb|Z99842.1|SPZ99842 [2766010]
gi|2766008|emb|Z99841.1|SPZ99841 [2766008]
gi|2766006|emb|Z99840.1|SPZ99840 [2766006]
gi|2766004|emb|Z99839.1|SPZ99839 [2766004]
gi|2766002|emb|Z99838.1|SPZ99838 [2766002]
gi|2766000|emb|Z99837.1|SPZ99837 [2766000]
gi|2765998|emb|Z99828.1|SPZ99828 [2765998]
gi|2765996|emb|Z99827.1|SPZ99827 [2765996]
gi|2765994|emb|Z99826.1|SPZ99826 [2765994]



gi|2765992|emb|Z99825.1|SPZ99825 [2765992] 

gi|2765990|emb|Z99824.1|SPZ99824 [2765990]

gi|2765988|emb|Z99823.1|SPZ99823 [2765988]

gi|2765986|emb|Z99822.1|SPZ99822 [2765986]

gi|2765984|emb|Z99821.1|SPZ99821 [2765984]

gi|2765982|emb|Z99820.1|SPZ99820 [2765982]

gi|2765980|emb|Z99819.1|SPZ99819 [2765980]

gi|2765978|emb|Z99818.1|SPZ99818 [2765978]

gi|2765976|emb|Z99817.1|SPZ99817 [2765976]

gi|2765974|emb|Z99816.1|SPZ99816 [2765974]

gi|2765972|emb|Z99815.1|SPZ99815 [2765972]

gi|2765970|emb|Z99814.1|SPZ99814 [2765970]

gi|2765968|emb|Z99813.1|SPZ99813 [2765968]

gi|2765966|emb|Z99812.1|SPZ99812 [2765966]

gi|2765964|emb|Z99811.1|SPZ99811 [2765964]

gi|2765962|emb|Z99810.1|SPZ99810 [2765962]

gi|2765960|emb|Z99809.1|SPZ99809 [2765960]

gi|2765958|emb|Z99808.1|SPZ99808 [2765958]

gi|2765956|emb|Z99807.1|SPZ99807 [2765956]

gi|2765954|emb|Z99806.1|SPZ99806 [2765954]

gi|2765952|emb|Z99805.1|SPZ99805 [2765952]

gi|2765950|emb|Z99804.1|SPZ99804 [2765950]

gi|2765948|emb|Z99803.1|SPZ99803 [2765948]

gi|2894104|emb|X77249.1|SPR6CIARH [2894104]

gi|3153897|gb|AF067128.1|AF067128 [3153897]

gi|3152712|gb|AF065153.1|AF065153 [3152712]

gi|3152710|gb|AF065152.1|AF065152 [3152710]

gi|3152708|gb|AF065151.1|AF065151 [3152708]

gi|3116426|gb|U84387.1|SPU84387 [3116426]

gi|2385403|emb|AJ001247.1|SP7465RR3
[2385403]

gi|2342540|emb|AJ001250.1|SP7978RR5
[2342540]

gi|2342539|emb|AJ001251.1|SP7978RR3
[2342539]

gi|2342538|emb|AJ001248.1|SP7466RR5
[2342538]

gi|2342537|emb|AJ001249.1|SP7466RR3
[2342537]

gi|3065896|gb|AF058920.1|AF058920 [3065896]
gi|2982647|emb|AJ002294.1|SPAJ2294 [2982647]
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gi|2982645|emb|AJ002293.1|SPAJ2293 [2982645] 
gi|2982643|emb|AJ002292.1|SPAJ2292 [2982643] 
gi|2982641|emb|AJ002291.1|SPAJ2291 [2982641]
gi|1620466|emb|X99400.1|SPDACA0 [1620466]
gi|2196665|emb|Z84381.1|HSZ84381 [2196665]
gi|2196663|emb|Z84380.1|HSZ84380 [2196663]
gi|2196661|emb|Z84379.1|HSZ84379 [2196661]
gi|2196659|emb|Z84378.1|HSZ84378 [2196659]
gi|625175|gb|L36131.1|STREXP10A [625175]
gi|3004945|gb|AF036624.1|AF036624 [3004945]
gi|3004943|gb|AF036623.1|AF036623 [3004943]
gi|3004941|gb|AF036622.1|AF036622 [3004941]
gi|3004939|gb|AF036621.1|AF036621 [3004939]
gi|3004937|gb|AF036620.1|AF036620 [3004937]
gi|3004935|gb|AF036619.1|AF036619 [3004935]
gi|2370572|emb|Z86112.1|SPZ86112 [2370572]
gi|2765946|emb|Z99802.1|SPZ99802 [2765946]
gi|2398824|emb|Z34303.1|SPCINREC [2398824]
gi|2894512|emb|AJ223491.1|SPPPR3 [2894512]
gi|2198539|emb|X85787.1|SPCPS14E [2198539]
gi|2766156|emb|Z99915.1|SPZ99915 [2766156]
gi|2766154|emb|Z99914.1|SPZ99914 [2766154]
gi|2766152|emb|Z99913.1|SPZ99913 [2766152]
gi|2766150|emb|Z99912.1|SPZ99912 [2766150]
gi|2766148|emb|Z99911.1|SPZ99911 [2766148]
gi|2766146|emb|Z99910.1|SPZ99910 [2766146]
gi|2766144|emb|Z99909.1|SPZ99909 [2766144]
gi|2766142|emb|Z99908.1|SPZ99908 [2766142]
gi|2766140|emb|Z99907.1|SPZ99907 [2766140]
gi|2766138|emb|Z99906.1|SPZ99906 [2766138]
gi|2766136|emb|Z99905.1|SPZ99905 [2766136]
gi|2766134|emb|Z99904.1|SPZ99904 [2766134]
gi|2766132|emb|Z99903.1|SPZ99903 [2766132]
gi|2766130|emb|Z99902.1|SPZ99902 [2766130]
gi|2766128|emb|Z99901.1|SPZ99901 [2766128]
gi|2766126|emb|Z99900.1|SPZ99900 [2766126]
gi|2766124|emb|Z99899.1|SPZ99899 [2766124]
gi|2766122|emb|Z99898.1|SPZ99898 [2766122]
gi|2766120|emb|Z99897.1|SPZ99897 [2766120]
gi|2766118|emb|Z99896.1|SPZ99896 [2766118]

gi|2766116|emb|Z99895.1|SPZ99895 [2766116]
gi|2766114|emb|Z99894.1|SPZ99894 [2766114]
gi|2766112|emb|Z99893.1|SPZ99893 [2766112]
gi|2766110|emb|Z99892.1|SPZ99892 [2766110]
gi|2766108|emb|Z99891.1|SPZ99891 [2766108]
gi|2766106|emb|Z99890.1|SPZ99890 [2766106]
gi|2766104|emb|Z99889.1|SPZ99889 [2766104]
gi|2766102|emb|Z99888.1|SPZ99888 [2766102]
gi|2766100|emb|Z99887.1|SPZ99887 [2766100]
gi|2766098|emb|Z99886.1|SPZ99886 [2766098]
gi|2766096|emb|Z99885.1|SPZ99885 [2766096]
gi|2766094|emb|Z99884.1|SPZ99884 [2766094]
gi|2766092|emb|Z99883.1|SPZ99883 [2766092]
gi|2766090|emb|Z99882.1|SPZ99882 [2766090]
gi|2766088|emb|Z99881.1|SPZ99881 [2766088]
gi|2766086|emb|Z99880.1|SPZ99880 [2766086]
gi|2766084|emb|Z99879.1|SPZ99879 [2766084]
gi|2766082|emb|Z99878.1|SPZ99878 [2766082]
gi|2766080|emb|Z99877.1|SPZ99877 [2766080]
gi|2766078|emb|Z99876.1|SPZ99876 [2766078]
gi|2766076|emb|Z99875.1|SPZ99875 [2766076]
gi|2766074|emb|Z99874.1|SPZ99874 [2766074]
gi|2766072|emb|Z99873.1|SPZ99873 [2766072]
gi|2766070|emb|Z99872.1|SPZ99872 [2766070]
gi|2766068|emb|Z99871.1|SPZ99871 [2766068]
gi|2766066|emb|Z99870.1|SPZ99870 [2766066]
gi|2766064|emb|Z99869.1|SPZ99869 [2766064]
gi|2766062|emb|Z99868.1|SPZ99868 [2766062]
gi|2766060|emb|Z99867.1|SPZ99867 [2766060]
gi|2766058|emb|Z99866.1|SPZ99866 [2766058]
gi|2766056|emb|Z99865.1|SPZ99865 [2766056]
gi|2766054|emb|Z99864.1|SPZ99864 [2766054]
gi|2765906|emb|Z99206.1|SPZ99206 [2765906]
gi|2765904|emb|Z99205.1|SPZ99205 [2765904]
gi|2765902|emb|Z99204.1|SPZ99204 [2765902]
gi|2765900|emb|Z99203.1|SPZ99203 [2765900]
gi|2765898|emb|Z99202.1|SPZ99202 [2765898]
gi|2765896|emb|Z99201.1|SPZ99201 [2765896]
gi|2765894|emb|Z99200.1|SPZ99200 [2765894]
gi|2708631|gb|AF036951.1|AF036951 [2708631]






gi|886956|emb|Z49097.1|SPCS1112X [886956]

gi|2656093|gb|L21856.1|STRMALR [2656093]

gi|2576332|emb|AJ002055.1|SPSPSA47 [2576332]

gi|2576330|emb|AJ002054.1|SPSPSA2 [2576330]

gi|2511704|emb|Y10818.1|SPY10818 [2511704]

gi|1944619|emb|Z83335.1|SPZ83335 [1944619]

gi|2425108|gb|AF019904.1|AF019904 [2425108]

gi|2385404|emb|AJ001246.1|SP7465RR5
[2385404]

gi|438213|emb|Z16082.1|PNALIB [438213]

gi|2149613|gb|U90721.1|SPU90721 [2149613]

gi|49391|emb|Z21841.1|SPPBP2BB [49391]

gi|2209207|gb|AF004325.1|AF004325 [2209207]

gi|2293061|emb|Z95914.1|SPZ95914 [2293061]

gi|2276393|gb|U16156.1|SPU16156 [2276393]

gi|2183314|gb|AF003930.1|AF003930 [2183314]

gi|2182093|emb|X95717.1|SPPARECGN
[2182093]

gi|984230|emb|Z49095.1|SPCS1111A [984230]

gi|886954|emb|Z49096.1|SPCS1092X [886954]

gi|1181613|dbj|D82873.1|STRPBP2BE [1181613]

gi|1181612|dbj|D82871.1|STRPBP2BCZ
[1181612]

gi|1181611|dbj|D82870.1|STRPBP2BB2 [1181611]

gi|1181579|dbj|D82869.1|STRPBP2BA1 [1181579]

gi|1181192|dbj|D82872.1|STRPBP2BD [1181192]

gi|575595|dbj|D42075.1|STRPBP2B2 [575595]

gi|1339971|dbj|D42074.1|STRPBP2B1 [1339971]

gi|2108329|emb|Y11463.1|SPDNAGCPO
[2108329]

gi|1944115|dbj|AB002522.1|AB002522 [1944115]
gi|1666669|emb|Z77727.1|SPIS1381C [1666669]
gi|1666668|emb|Z77726.1|SPIS1381B [1666668]
gi|1666667|emb|Z77725.1|SPIS1381A [1666667]
gi|1914873|emb|Z82002.1|SPZ82002 [1914873]
gi|1431584|emb|Z74778.1|SPDHFR [1431584]
gi|47452|emb|Z15120.1|SPSTRG [47452]
gi|581717|emb|Z12159.1|SPCP131G [581717]
gi|47342|emb|X17337.1|SPAMILOC [47342]
gi|1800300|gb|U83667.1|SPU83667 [1800300]
gi|1532066|emb|Y07780.1|SPTETOGEN [1532066]

gi|1161269|gb|L39074.1|STRSPXB [1161269]

gi|1460093|emb|X94909.1|SPIGA1PRT [1460093]

gi|1750263|gb|U72720.1|SPU72720 [1750263]

gi|298649|gb|S56948.1|S56948 [298649]

gi|254537|gb|S43511.1|S43511 [254537]

gi|245227|gb|S81051.1|S81051 [245227]

gi|245226|gb|S81045.1|S81045 [245226]

gi|245225|gb|S81043.1|S81043 [245225]

gi|1150618|emb|Z49988.1|SPMMSAGEN
[1150618]

gi|47456|emb|X01138.1|SPTN917A [47456]

gi|1658316|emb|Z47210.1|SPDEXCAP [1658316]

gi|1550802|emb|X95385.1|SPGOMCGEN
[1550802]

gi|47457|emb|X01137.1|SPTN917B [47457]

gi|975714|emb|X90941.1|SPTRJ5251 [975714]

gi|975713|emb|X90940.1|SPTLJ5251 [975713]

gi|975709|emb|X90939.1|SPDNATETM [975709]

gi|1524346|emb|Z79691.1|SOORFS [1524346]

gi|1553054|emb|X98364.1|SPPBPHU9 [1553054]

gi|1553052|emb|X98367.1|SPPBPHU13 [1553052]

gi|1553050|emb|X98366.1|SPPBPHU12 [1553050]

gi|1553048|emb|X98365.1|SPPBPHU11 [1553048]

gi|1575029|gb|U53509.1|SPU53509 [1575029]

gi|1542968|gb|U49088.1|SPU49088 [1542968]

gi|1542966|gb|U49087.1|SPU49087 [1542966]

gi|1536961|emb|Y07845.1|SPGYRA [1536961]

gi|47391|emb|X16367.1|SPPBPX [47391]

gi|1490398|emb|Z67739.1|SPPARCETP [1490398]

gi|1490395|emb|Z67740.1|SPGYRBORF
[1490395]

gi|1431589|emb|Z74777.1|SPTMRDHFR
[1431589]

gi|408145|emb|Z21702.1|SPUNGMUTX [408145]
gi|47461|emb|X61025.1|SPXISINT [47461]
gi|47459|emb|X55651.1|SPUNGG [47459]
gi|47454|emb|X52632.1|SPT1545E [47454]
gi|47421|emb|Z17307.1|SPRECA [47421]
gi|47419|emb|X67873.1|SPPONA8 [47419]
gi|47417|emb|X67872.1|SPPONA7 [47417]
gi|47415|emb|X67871.1|SPPONA6 [47415]






gi|47413|emb|X67870.1|SPPONA5 [47413]
gi|47411|emb|X67869.1|SPPONA4 [47411]
gi|47409|emb|X67867.1|SPPONA2 [47409]
gi|47407|emb|X67866.1|SPPONA1 [47407]
gi|47405|emb|X67868.1|SPPNA3 [47405]
gi|47403|emb|X52474.1|SPPLY [47403]
gi|984232|emb|X16022.1|SPPENA [984232]
gi|517190|emb|X78215.1|SPPBPXG [517190]
gi|295840|emb|Z22230.1|SPPBP2BBA [295840]
gi|288981|emb|Z22185.1|SPPBP2BAC [288981]
gi|288979|emb|Z22184.1|SPPBP2BAB [288979]
gi|288466|emb|Z21981.1|SPPBP2BAA [288466]
gi|49390|emb|Z21813.1|SPPBP2XD [49390]
gi|49389|emb|Z21812.1|SPPBP2XC [49389]
gi|49387|emb|Z21811.1|SPPBP2BJ [49387]
gi|49385|emb|Z21810.1|SPPBP2BI [49385]
gi|49382|emb|Z21808.1|SPPBP2BH [49382]
gi|49380|emb|Z21807.1|SPPBP2BG [49380]
gi|49379|emb|Z21806.1|SPPBP2BF [49379]
gi|49377|emb|Z21805.1|SPPBP2BE [49377]
gi|49376|emb|Z21804.1|SPPBP2XB [49376]
gi|49375|emb|Z21803.1|SPPBP2XA [49375]
gi|49374|emb|Z21802.1|SPPBP2BD [49374]
gi|49372|emb|Z21801.1|SPPBP2BC [49372]
gi|49369|emb|Z21799.1|SPPBP2BA [49369]
gi|47399|emb|X13137.1|SPPENASE [47399]
gi|47397|emb|X13136.1|SPPENARE [47397]
gi|1052802|emb|X83917.1|SPGYRBG [1052802]
gi|587550|emb|X72967.1|SPNANA [587550]
gi|49384|emb|Z21809.1|SPPBP1AB [49384]
gi|49371|emb|Z21800.1|SPPBP1AA [49371]
gi|984228|emb|Z49094.1|SPCS1091A [984228]
gi|47372|emb|X54225.1|SPENDA [47372]
gi|806590|emb|Z49246.1|SP667SOD [806590]
gi|407172|emb|Z26851.1|SPATPAS2 [407172]
gi|407166|emb|Z26850.1|SPATPAS1 [407166]
gi|47353|emb|X63602.1|SPBOX [47353]
gi|47348|emb|X05577.1|SPAPHA3 [47348]
gi|47337|emb|X65132.1|SP824PBPX [47337]
gi|47335|emb|X65134.1|SP669PBPX [47335]

gi|47331|emb|X65133.1|SP577PBPX [47331]
gi|559527|emb|X65136.1|SP110PBPX [559527]
gi|311415|emb|Z22807.1|SP16SRNAA [311415]
gi|47329|emb|X65135.1|SP531PBPX [47329]
gi|47307|emb|X65131.1|SP290PBPX [47307]
gi|47295|emb|X58312.1|SP16SRNA [47295]
gi|854614|emb|Z49109.1|SPGADAGN [854614]
gi|556428|gb|L36660.1|STRORF1 [556428]
gi|511062|emb|Z35135.1|SPALIAG [511062]
gi|1208737|gb|U47625.1|SPU47625 [1208737]
gi|530062|gb|U12567.1|SPU12567 [530062]
gi|153656|gb|M29686.1|STRHEXB [153656]
gi|153654|gb|M18729.1|STRHEXA [153654]
gi|153608|gb|M14339.1|STRDPN2A [153608]
gi|153605|gb|M14340.1|STRDPN1A [153605]
gi|643543|gb|U20084.1|SPU20084 [643543]
gi|643541|gb|U20083.1|SPU20083 [643541]
gi|643539|gb|U20082.1|SPU20082 [643539]
gi|643537|gb|U20081.1|SPU20081 [643537]
gi|643535|gb|U20080.1|SPU20080 [643535]
gi|643533|gb|U20079.1|SPU20079 [643533]
gi|643531|gb|U20078.1|SPU20078 [643531]
gi|643529|gb|U20077.1|SPU20077 [643529]
gi|643527|gb|U20076.1|SPU20076 [643527]
gi|643525|gb|U20075.1|SPU20075 [643525]
gi|643523|gb|U20074.1|SPU20074 [643523]
gi|643521|gb|U20073.1|SPU20073 [643521]
gi|643519|gb|U20072.1|SPU20072 [643519]
gi|643517|gb|U20071.1|SPU20071 [643517]
gi|643515|gb|U20070.1|SPU20070 [643515]
gi|643513|gb|U20069.1|SPU20069 [643513]
gi|643511|gb|U20068.1|SPU20068 [643511]
gi|643509|gb|U20067.1|SPU20067 [643509]
gi|1017802|gb|U37560.1|SPU37560 [1017802]
gi|663277|gb|M36180.1|STRCOMAA [663277]
gi|437704|gb|L20670.1|STRHYALURO [437704]
gi|153849|gb|L07751.1|TRNTN5252R [153849]
gi|153855|gb|M25519.1|STRVA1 [153855]
gi|153853|gb|M80215.1|STRUVS402A [153853]
gi|153848|gb|L07750.1|STRTN5252L [153848]

gi|153840|gb|M74122.1|STRSURPROA [153840]
gi|153796|gb|M60763.1|STRRRNAA [153796]
gi|153791|gb|M31296.1|STRRECP [153791]
gi|516639|gb|L20556.1|STRPLPA [516639]
gi|153783|gb|M28679.1|STRPROMB [153783]
gi|153782|gb|M28678.1|STRPROMA [153782]
gi|153766|gb|M90527.1|STRPONA [153766]
gi|153764|gb|J04479.1|STRPOLA [153764]
gi|153752|gb|M25515.1|STRNG4369 [153752]
gi|153722|gb|L08611.1|STRMLTODX [153722]
gi|153702|gb|J01796.1|STRMALMXP [153702]
gi|153701|gb|J01795.1|STRMALMX [153701]
gi|153693|gb|M13812.1|STRLYTPN [153693]
gi|153691|gb|M17717.1|STRLYS [153691]
gi|153667|gb|M25525.1|STRKAG73 [153667]
gi|398102|gb|L20564.1|STREXP9B [398102]
gi|398100|gb|L20563.1|STREXP9A [398100]
gi|398098|gb|L20562.1|STREXP8A [398098]
gi|398096|gb|L20561.1|STREXP7A [398096]
gi|398094|gb|L20560.1|STREXP6A [398094]
gi|398092|gb|L20559.1|STREXP5A [398092]
gi|398090|gb|L20558.1|STREXP4A [398090]
gi|153626|gb|J04234.1|STREXOA [153626]
gi|153612|gb|M11226.1|STRDPNM [153612]
gi|153603|gb|M25521.1|STRDN87669 [153603]
gi|153601|gb|M25526.1|STRDN87577 [153601]
gi|153599|gb|M25522.1|STRDN179 [153599]
gi|153594|gb|M37688.1|STRDACA [153594]
gi|153582|gb|L07752.1|STRATTB [153582]
gi|466514|gb|L31413.1|STR1RRA [466514]
gi|153551|gb|M25520.1|STR8249 [153551]
gi|153549|gb|M25524.1|STR5313972 [153549]
gi|153547|gb|M25517.1|STR29044 [153547]
gi|153545|gb|M25523.1|STR181071 [153545]
gi|153541|gb|M25518.1|STR121 [153541]
gi|153539|gb|M25516.1|STR110K70 [153539]
gi|506632|gb|U04047.1|SPU04047 [506632]
gi|393267|gb|L19055.1|STRPAPA [393267]
gi|442066|gb|S62272.1|S62272 [442066]
gi|295191|gb|L15190.1|STRPURISYN [295191]
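
Each entry in this listing follows the pattern gi|number|database|accession.version|locus [number], with the bracketed number repeating the gi number. The following is a minimal sketch, not part of the original disclosure, of how such entries could be checked for internal consistency. The function name and the regular expression are illustrative assumptions, and entries whose bracketed number wraps onto the next line would need to be joined before parsing.

```python
import re

# Assumed entry layout, inferred from the listing itself:
#   gi|<number>|<gb|emb|dbj>|<accession>.<version>|<locus> [<number>]
ENTRY = re.compile(
    r"^gi\|(?P<gi>\d+)\|(?P<db>gb|emb|dbj)\|"
    r"(?P<acc>[A-Z]+\d+)\.(?P<ver>\d+)\|(?P<locus>\S+)"
    r"\s+\[(?P<gi2>\d+)\]$"
)


def parse_entry(line):
    """Return the fields of one listing entry, or None if it is malformed.

    An entry is treated as malformed when it does not match the assumed
    layout or when the bracketed number differs from the gi number.
    """
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    if fields["gi"] != fields["gi2"]:
        return None
    return {
        "gi": int(fields["gi"]),
        "db": fields["db"],
        "accession": fields["acc"],
        "version": int(fields["ver"]),
        "locus": fields["locus"],
    }


if __name__ == "__main__":
    print(parse_entry("gi|4519225|dbj|AB011203.1|AB011203 [4519225]"))
```

A line damaged by character-recognition spacing, such as one with spaces inside the gi number, returns None under this sketch and can be flagged for manual correction.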
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Pig. 1A 



Construction steps and labels recoverable from the figure:

PCR of pT0021 with XhoF & BamHNR
XhoF: 5'-AATTCTCGAGTAAAATAACAT-3' (contains the XhoI site)
BamHNR: spans the stop of arsR, the RBS, and a BamHI site
Digestion with XhoI & BamHI, followed by ligation
Result: modified between the stop of arsR and BamHI

PCR of pT0021 with LucFFB & LucFFH
LucFFB: BamHI site at the start of LucFF (the original BamHI was modified)
LucFFH: HindIII site at the stop of LucFF
Digestion with BamHI & HindIII
Result: modified in the vicinity of BamHI

Cloning site for ORFs: BamHI & HindIII
No additional codons in the induced protein

Final construct: P arsR, XhoI (CTCGAG), RBS, BamHI (GGATCC), LucFF, HindIII

Fig. 1B

Labels: XhoI, BamHI, SalI, HindIII; ATG (first codon); ORF

Digestion of vector and ORF with BamHI & SalI

Verification of pTHA/ORF clones by PCR and sequencing



Fig. 2 



Fig. 3 

A) Functional assay on semi-solid support media

Frozen stock of phage 77 pTHA/ORF S. aureus RN4220 transformants
1:10 and 1:100 dilution in saline solution
Streak 5 µl of the 1:10 dilution onto agar plates containing 0, 2.5, 5, and 7.5 µM NaAsO2
Spot 3 µl of the 1:10 and 1:100 dilutions onto agar plates containing 0, 2.5, 5, and 7.5 µM NaAsO2
Incubate O/N, 37°C
Compare bacterial growth on plates with and without NaAsO2

B) Functional assay in liquid medium

O/N culture inoculated from frozen stock of phage 77 pTHA/ORF S. aureus RN4220 transformants
1:100 dilution of O/N culture
2 h, 37°C, 250 rpm
Fresh culture
150 µl into 2.5 ml containing 0 and 5 µM NaAsO2
3.5 h, 37°C, 250 rpm
Measure OD 5 g 5
1:10 serial dilution from 10^-1 to 10^-6
Spot 20 µl of the 10^-4 to 10^-6 dilutions onto agar plate
Incubate O/N, 37°C
Count colonies
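
The colony counts obtained at the end of the liquid-medium assay can be converted into viable counts per millilitre of culture. The sketch below is an illustration only: it assumes the conventional calculation (colonies divided by the product of the dilution and the spotted volume), the function names are hypothetical, and the colony numbers are invented for the example rather than taken from the disclosure.

```python
def cfu_per_ml(colonies, dilution, spotted_volume_ul=20.0):
    """Estimate colony-forming units per ml of the undiluted culture.

    colonies          -- colonies counted in one spot
    dilution          -- dilution of the spotted sample, e.g. 1e-5
    spotted_volume_ul -- spotted volume in microlitres (20 per the assay)
    """
    if colonies < 0 or dilution <= 0 or spotted_volume_ul <= 0:
        raise ValueError("counts must be >= 0; dilution and volume must be > 0")
    volume_ml = spotted_volume_ul / 1000.0
    return colonies / (dilution * volume_ml)


def relative_survival(cfu_induced, cfu_uninduced):
    """Ratio of viable counts with NaAsO2 to viable counts without it."""
    if cfu_uninduced <= 0:
        raise ValueError("uninduced count must be > 0")
    return cfu_induced / cfu_uninduced


if __name__ == "__main__":
    # Hypothetical counts: 40 colonies at 10^-6 without inducer,
    # 12 colonies at 10^-4 with 5 uM NaAsO2.
    uninduced = cfu_per_ml(40, 1e-6)
    induced = cfu_per_ml(12, 1e-4)
    print(f"uninduced: {uninduced:.3g} CFU/ml")
    print(f"induced:   {induced:.3g} CFU/ml")
    print(f"relative survival: {relative_survival(induced, uninduced):.2e}")
```

A low ratio for a given pTHA/ORF transformant would be consistent with the induced ORF inhibiting growth, which is the comparison the assay is designed to make.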



A. Inhibition of bacterial growth with individual ORFs of a S. aureus bacteriophage.
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Fig. 4C 



C. Inhibition of bacterial growth with individual ORFs of a S. aureus bacteriophage. 
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